Policy Gradient RL Algorithms as Directed Acyclic Graphs

14 Dec 2020 · Juan Jose Garau Luis

Meta Reinforcement Learning (RL) methods focus on automating the design of RL algorithms that generalize to a wide range of environments. The framework introduced in (Anonymous, 2020) addresses the problem by representing different RL algorithms as Directed Acyclic Graphs (DAGs), and using an evolutionary meta-learner to modify these graphs and find good agent update rules...
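To make the DAG representation concrete, here is a minimal sketch (not the paper's actual implementation) of how a policy gradient update rule can be encoded as a graph of primitive operations and evaluated in dependency order. The node and operation names (`policy_log_prob`, `discounted_return`, etc.) are hypothetical placeholders for illustration.

```python
# Hypothetical sketch: a REINFORCE-style loss, loss = -log_prob * return,
# encoded as a DAG of primitive operation nodes.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                                      # primitive operation name
    inputs: list = field(default_factory=list)   # parent node ids

# The update rule as a graph: each entry maps a node id to its operation
# and the node ids it consumes.
dag = {
    "log_prob": Node("policy_log_prob"),
    "ret": Node("discounted_return"),
    "neg": Node("negate", ["log_prob"]),
    "loss": Node("multiply", ["neg", "ret"]),
}

def evaluate(dag, outputs, env_inputs, ops):
    """Evaluate the requested output nodes via memoized recursion,
    which visits nodes in topological (dependency) order."""
    cache = {}
    def val(name):
        if name not in cache:
            node = dag[name]
            args = [val(p) for p in node.inputs]
            cache[name] = ops[node.op](env_inputs, *args)
        return cache[name]
    return {o: val(o) for o in outputs}

# Toy primitive implementations over scalar rollout statistics.
ops = {
    "policy_log_prob": lambda env: env["log_prob"],
    "discounted_return": lambda env: env["ret"],
    "negate": lambda env, x: -x,
    "multiply": lambda env, x, y: x * y,
}

result = evaluate(dag, ["loss"], {"log_prob": -0.5, "ret": 2.0}, ops)
print(result["loss"])  # 1.0
```

Under this encoding, an evolutionary meta-learner can mutate the graph (swap operations, rewire inputs, add nodes) to search the space of update rules, as the paper's framework does.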



Methods used in the Paper


METHOD                     TYPE
Dilated Convolution        Convolutions
Global Average Pooling     Pooling Operations
Average Pooling            Pooling Operations
1x1 Convolution            Convolutions
SAC                        Policy Gradient Methods
Batch Normalization        Normalization
Adam                       Stochastic Optimization
Experience Replay          Replay Memory
ReLU                       Activation Functions
Convolution                Convolutions
Q-Learning                 Off-Policy TD Control
Clipped Double Q-learning  Off-Policy TD Control
Target Policy Smoothing    Regularization
TD3                        Policy Gradient Methods
Weight Decay               Regularization
Dense Connections          Feedforward Networks
Entropy Regularization     Regularization
DDPG                       Policy Gradient Methods
DQN                        Q-Learning Networks
PPO                        Policy Gradient Methods