Search Results for author: Haoxuan Pan

Found 1 paper, 0 papers with code

Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement Learning

no code implementations • 20 Jan 2023 • Haoxuan Pan, Deheng Ye, Xiaoming Duan, Qiang Fu, Wei Yang, Jianping He, Mingfei Sun

We show that, despite such state distribution shift, the policy gradient estimation bias can be reduced in the following three ways: 1) a small learning rate; 2) an adaptive-learning-rate-based optimizer; and 3) KL regularization.

Continuous Control • reinforcement-learning
