Exploring Model-based Planning with Policy Networks

Model-based reinforcement learning (MBRL) with model-predictive control or online planning has shown great potential for locomotion control tasks in terms of both sample efficiency and asymptotic performance. Despite their initial successes, the existing planning methods search from candidate sequences randomly generated in the action space, which is inefficient in complex high-dimensional environments...
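
The kind of search the abstract describes, scoring candidate action sequences drawn at random in the action space, is often called random shooting. A minimal sketch follows, assuming a toy point-mass model in place of a learned dynamics model. The names `toy_dynamics`, `toy_reward`, and `random_shooting`, along with every parameter value, are hypothetical and chosen for illustration only; this shows the baseline style of planner the abstract refers to, not the paper's proposed method.

```python
import numpy as np


def toy_dynamics(states, actions):
    # Hypothetical stand-in for a learned dynamics model:
    # the state simply shifts by the action (state_dim == action_dim).
    return states + actions


def toy_reward(states, actions):
    # Hypothetical reward: stay near the origin, with a small action penalty.
    return -np.sum(states ** 2, axis=-1) - 0.01 * np.sum(actions ** 2, axis=-1)


def random_shooting(state, horizon=10, n_candidates=500,
                    action_low=-1.0, action_high=1.0, rng=None):
    """Return the first action of the best randomly sampled sequence."""
    rng = np.random.default_rng() if rng is None else rng
    action_dim = state.shape[0]  # assumption tied to toy_dynamics above

    # Sample every candidate sequence uniformly in the action space.
    actions = rng.uniform(action_low, action_high,
                          size=(n_candidates, horizon, action_dim))

    # Roll all candidates forward through the model in parallel.
    states = np.repeat(state[None, :], n_candidates, axis=0)
    returns = np.zeros(n_candidates)
    for t in range(horizon):
        returns += toy_reward(states, actions[:, t])
        states = toy_dynamics(states, actions[:, t])

    # Model-predictive control executes only the first action, then replans.
    best = int(np.argmax(returns))
    return actions[best, 0], returns[best]


first_action, best_return = random_shooting(
    np.array([1.0, -1.0]), rng=np.random.default_rng(0))
```

The number of candidates needed to cover the space of sequences grows quickly with the action dimension and the horizon, which is the inefficiency the abstract points to in high-dimensional environments.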

ICLR 2020

Methods used in the Paper


METHOD                       TYPE
Experience Replay            Replay Memory
Dense Connections            Feedforward Networks
ReLU                         Activation Functions
Target Policy Smoothing      Regularization
Clipped Double Q-learning    Off-Policy TD Control
Adam                         Stochastic Optimization
TD3                          Policy Gradient Methods