Monte Carlo Tree Search (MCTS) has achieved impressive results on a range of discrete environments, such as Go, Mario and Arcade games, but it has not yet fulfilled its true potential in continuous domains.In this work, we introduceTPO, a tree search based policy optimization method for continuous environments. TPO takes a hybrid approach to policy optimization... (read more)
PDF Abstract