Off-Policy Actor-Critic with Shared Experience Replay

ICML 2020  ·  Simon Schmitt, Matteo Hessel, Karen Simonyan

We investigate the combination of actor-critic reinforcement learning algorithms with uniform large-scale experience replay and propose solutions for two challenges: (a) efficient actor-critic learning with experience replay, and (b) the stability of off-policy learning where agents learn from other agents' behaviour. We employ those insights to accelerate hyper-parameter sweeps in which all participating agents run concurrently and share their experience via a common replay module. To this end we analyze the bias-variance tradeoffs in V-trace, a form of importance sampling for actor-critic methods. Based on our analysis, we then argue for mixing experience sampled from replay with on-policy experience, and propose a new trust region scheme that scales effectively to data distributions where V-trace becomes unstable. We provide extensive empirical validation of the proposed solution. We further show the benefits of this setup by demonstrating state-of-the-art data efficiency on Atari among agents trained for up to 200M environment frames.
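
For context, the abstract refers to V-trace (Espeholt et al., 2018), the off-policy correction whose bias-variance tradeoffs the paper analyzes. Below is a minimal NumPy sketch of the standard V-trace value targets for a single trajectory; the function name `vtrace_targets` and its argument names are illustrative rather than taken from the paper, and the paper's own contributions (mixing replay with on-policy data and the trust region scheme) are not shown here.

```python
import numpy as np

def vtrace_targets(rewards, values, bootstrap_value, log_rhos,
                   gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Standard V-trace value targets for one trajectory of length T.

    rewards, log_rhos : arrays of length T
    values            : V(x_t) for t = 0..T-1
    bootstrap_value   : V(x_T), used to bootstrap past the trajectory end
    log_rhos          : log(pi(a_t|x_t) / mu(a_t|x_t)), target vs. behaviour policy
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    log_rhos = np.asarray(log_rhos, dtype=np.float64)
    T = len(rewards)

    # Clipped importance weights (rho) and trace coefficients (c).
    rhos = np.minimum(rho_bar, np.exp(log_rhos))
    cs = np.minimum(c_bar, np.exp(log_rhos))

    # Off-policy-corrected temporal-difference errors.
    values_tp1 = np.append(values[1:], bootstrap_value)
    deltas = rhos * (rewards + gamma * values_tp1 - values)

    # Backward recursion: v_t - V(x_t) = delta_t + gamma * c_t * (v_{t+1} - V(x_{t+1})).
    acc = 0.0
    corrections = np.zeros(T)
    for t in reversed(range(T)):
        acc = deltas[t] + gamma * cs[t] * acc
        corrections[t] = acc
    return values + corrections
```

The clipping constants rho_bar and c_bar control the bias-variance tradeoff discussed in the abstract: smaller values reduce variance from large importance ratios at the cost of added bias, which is what becomes problematic when learning from strongly off-policy replay data shared across agents.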


Results from the Paper


Task         Dataset      Model   Metric Name                       Metric Value   Global Rank
Atari Games  Atari-57     LASER   Human World Record Breakthrough   7              #7
Atari Games  Atari-57     LASER   Mean Human Normalized Score       1741.36%       #7
Atari Games  Atari games  LASER   Mean Human Normalized Score       1741.36%       #8

Methods