Truly Proximal Policy Optimization

19 Mar 2019 · Yuhui Wang, Hao He, Chao Wen, Xiaoyang Tan

Proximal policy optimization (PPO) is one of the most successful deep reinforcement-learning methods, achieving state-of-the-art performance across a wide range of challenging tasks. However, its optimization behavior is still far from being fully understood...




Methods used in the Paper


METHOD                  TYPE
Entropy Regularization  Regularization
PPO                     Policy Gradient Methods