A Self-Tuning Actor-Critic Algorithm

Reinforcement learning algorithms are highly sensitive to the choice of hyperparameters, typically requiring significant manual effort to identify hyperparameters that perform well on a new domain. In this paper, we take a step towards addressing this issue by using metagradients to adapt hyperparameters online via meta-gradient descent (Xu et al., 2018)...
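To make the core idea concrete, below is a minimal sketch of online hyperparameter adaptation by meta-gradient descent: an inner update is taken with respect to a loss that depends on a differentiable hyperparameter, and an outer loss evaluated at the updated parameters is differentiated back through that update to adjust the hyperparameter. This is not the paper's implementation; the toy quadratic loss, the names (inner_loss, meta_step, meta_param), and the step sizes are illustrative assumptions only.

```python
import jax
import jax.numpy as jnp

def inner_loss(params, meta_param, batch):
    # Stand-in for an agent loss whose shape depends on a differentiable
    # hyperparameter (here a simple penalty coefficient, purely illustrative).
    preds = params @ batch["x"]
    return jnp.mean((preds - batch["y"]) ** 2) + meta_param * jnp.sum(params ** 2)

def inner_update(params, meta_param, batch, lr=0.1):
    # One differentiable SGD step on the inner loss.
    grads = jax.grad(inner_loss)(params, meta_param, batch)
    return params - lr * grads

def outer_loss(meta_param, params, batch):
    # Evaluate the *updated* parameters under a fixed outer objective;
    # gradients flow through inner_update into the hyperparameter.
    new_params = inner_update(params, meta_param, batch)
    return inner_loss(new_params, 0.0, batch)

@jax.jit
def meta_step(params, meta_param, batch, meta_lr=0.01):
    # Meta-gradient descent on the hyperparameter, then an inner update
    # with the freshly adapted value.
    meta_grad = jax.grad(outer_loss)(meta_param, params, batch)
    new_meta_param = meta_param - meta_lr * meta_grad
    new_params = inner_update(params, new_meta_param, batch)
    return new_params, new_meta_param

key = jax.random.PRNGKey(0)
params = jax.random.normal(key, (4,))
batch = {"x": jax.random.normal(key, (4, 32)), "y": jnp.zeros(32)}
meta_param = jnp.asarray(0.01)
params, meta_param = meta_step(params, meta_param, batch)
```

In the self-tuning actor-critic setting, the adapted quantities would be differentiable hyperparameters such as loss coefficients or bootstrapping parameters rather than the toy penalty used here.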

Methods used in the Paper


METHOD                  | TYPE
Sigmoid Activation      | Activation Functions
Tanh Activation         | Activation Functions
Experience Replay       | Replay Memory
Entropy Regularization  | Regularization
Residual Connection     | Skip Connections
Gradient Clipping       | Optimization
RMSProp                 | Stochastic Optimization
ReLU                    | Activation Functions
Max Pooling             | Pooling Operations
Convolution             | Convolutions
LSTM                    | Recurrent Neural Networks
V-trace                 | Value Function Estimation
IMPALA                  | Policy Gradient Methods
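Since V-trace and IMPALA appear among the methods above, a minimal sketch of the V-trace target computation is included here for reference. The function name, argument layout, and clipping defaults are illustrative assumptions, not code from the paper or the IMPALA release; it implements the standard recursion v_s = V(x_s) + delta_s + discount_s * c_s * (v_{s+1} - V(x_{s+1})).

```python
import jax
import jax.numpy as jnp

def vtrace_targets(values, rewards, discounts, rhos, clip_rho=1.0, clip_c=1.0):
    """Compute V-trace value targets for a single trajectory.

    values:    V(x_0 .. x_T), shape [T + 1] (last entry is the bootstrap value)
    rewards:   r_0 .. r_{T-1}, shape [T]
    discounts: per-step discount, e.g. gamma * (1 - done), shape [T]
    rhos:      importance ratios pi(a|x) / mu(a|x), shape [T]
    """
    clipped_rhos = jnp.minimum(clip_rho, rhos)
    clipped_cs = jnp.minimum(clip_c, rhos)
    # Importance-weighted one-step TD errors.
    deltas = clipped_rhos * (rewards + discounts * values[1:] - values[:-1])

    # Backward recursion on the corrections (v_s - V(x_s)).
    def step(carry, inputs):
        delta, discount, c = inputs
        carry = delta + discount * c * carry
        return carry, carry

    _, corrections = jax.lax.scan(
        step, jnp.zeros(()), (deltas, discounts, clipped_cs), reverse=True)
    return values[:-1] + corrections

# Example usage with dummy data: 5 transitions plus a bootstrap value.
T = 5
targets = vtrace_targets(
    values=jnp.linspace(0.0, 1.0, T + 1),
    rewards=jnp.ones(T),
    discounts=0.99 * jnp.ones(T),
    rhos=jnp.ones(T))
```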