PGPS : Coupling Policy Gradient with Population-based Search

1 Jan 2021 Anonymous

Gradient-based policy search algorithms (such as PPO, SAC or TD3) in deep reinforcement learning (DRL) have shown successful results on a range of challenging control tasks. However, they often suffer from flat or deceptive gradient problems...


Methods used in the Paper

METHOD                      TYPE
Dilated Convolution         Convolutions
Global Average Pooling      Pooling Operations
Average Pooling             Pooling Operations
Convolution                 Convolutions
1x1 Convolution             Convolutions
SAC                         Convolutions
Entropy Regularization      Regularization
Experience Replay           Replay Memory
Target Policy Smoothing     Regularization
Dense Connections           Feedforward Networks
PPO                         Policy Gradient Methods
Clipped Double Q-learning   Off-Policy TD Control
ReLU                        Activation Functions
Adam                        Stochastic Optimization
TD3                         Policy Gradient Methods