OffCon$^3$: What is state of the art anyway?

27 Jan 2021 Philip J. Ball Stephen J. Roberts

Two popular approaches to model-free continuous control tasks are SAC and TD3. At first glance these approaches seem rather different; SAC aims to solve the entropy-augmented MDP by minimising the KL-divergence between a stochastic proposal policy and a hypothetical energy-based soft Q-function policy, whereas TD3 is derived from DPG, which uses a deterministic policy to perform policy gradient ascent along the value function...
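The contrast drawn above can be made concrete with the standard formulations of the two objectives; the notation below (soft Q-function $Q_\theta$, partition function $Z_\theta$, deterministic policy $\mu_\phi$, replay buffer $\mathcal{D}$) is the conventional one and is assumed here, not quoted from the paper:

```latex
% SAC policy improvement: minimise the KL-divergence between the
% stochastic policy and the energy-based soft Q-function policy
J_\pi(\phi) = \mathbb{E}_{s \sim \mathcal{D}}\!\left[
  D_{\mathrm{KL}}\!\left( \pi_\phi(\cdot \mid s) \,\middle\|\,
  \frac{\exp\!\left(Q_\theta(s, \cdot)\right)}{Z_\theta(s)} \right) \right]

% DPG: gradient ascent on the value function through a
% deterministic policy \mu_\phi, via the chain rule
\nabla_\phi J(\phi) = \mathbb{E}_{s \sim \mathcal{D}}\!\left[
  \left. \nabla_a Q(s, a) \right|_{a = \mu_\phi(s)}
  \nabla_\phi \mu_\phi(s) \right]
```

The first expression is what "minimising the KL-divergence to an energy-based policy" refers to; the second is the deterministic policy gradient underlying TD3.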



Methods used in the Paper

| Method | Type |
| --- | --- |
| Dilated Convolution | Convolutions |
| Global Average Pooling | Pooling Operations |
| Average Pooling | Pooling Operations |
| Convolution | Convolutions |
| 1x1 Convolution | Convolutions |
| SAC | Policy Gradient Methods |
| Dense Connections | Feedforward Networks |
| Adam | Stochastic Optimization |
| Target Policy Smoothing | Regularization |
| Experience Replay | Replay Memory |
| Clipped Double Q-learning | Off-Policy TD Control |
| ReLU | Activation Functions |
| TD3 | Policy Gradient Methods |
| DPG | Policy Gradient Methods |
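Two of the TD3-specific entries above, target policy smoothing and clipped double Q-learning, combine in a single critic target. The sketch below is a minimal, dependency-free illustration with a scalar action and hypothetical callables (`q1`, `q2`, `target_policy`); it is not taken from the paper's implementation:

```python
import random

def td3_target(q1, q2, target_policy, next_state, reward,
               gamma=0.99, noise_std=0.2, noise_clip=0.5):
    """Compute a TD3 critic target for one transition (scalar-action sketch)."""
    # Target policy smoothing (regularization): perturb the target action
    # with clipped Gaussian noise so the critic is smoothed over nearby actions.
    noise = max(-noise_clip, min(noise_clip, random.gauss(0.0, noise_std)))
    a = target_policy(next_state) + noise
    # Clipped double Q-learning: take the minimum over the two target
    # critics to curb overestimation bias in the bootstrap.
    q_min = min(q1(next_state, a), q2(next_state, a))
    return reward + gamma * q_min
```

With constant critics, e.g. `q1 = lambda s, a: 1.0` and `q2 = lambda s, a: 2.0`, the minimum selects the pessimistic estimate and the target is `reward + gamma * 1.0` regardless of the sampled noise.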