TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Atari Games	Atari 2600 Freeway	A3C-CTS	Score	30.48	# 32
Atari Games	Atari 2600 Gravitar	A3C-CTS	Score	238.68	# 49
Atari Games	Atari 2600 Montezuma's Revenge	DDQN-PC	Score	3459	# 10
Atari Games	Atari 2600 Montezuma's Revenge	A3C-CTS	Score	273.7	# 23
Atari Games	Atari 2600 Private Eye	A3C-CTS	Score	99.32	# 46
Atari Games	Atari 2600 Venture	A3C-CTS	Score	0.0	# 49

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/unifying-count-based-exploration-and/atari-games-on-atari-2600-montezumas-revenge)](https://paperswithcode.com/sota/atari-games-on-atari-2600-montezumas-revenge?p=unifying-count-based-exploration-and)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/unifying-count-based-exploration-and/atari-games-on-atari-2600-freeway)](https://paperswithcode.com/sota/atari-games-on-atari-2600-freeway?p=unifying-count-based-exploration-and)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/unifying-count-based-exploration-and/atari-games-on-atari-2600-private-eye)](https://paperswithcode.com/sota/atari-games-on-atari-2600-private-eye?p=unifying-count-based-exploration-and)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/unifying-count-based-exploration-and/atari-games-on-atari-2600-gravitar)](https://paperswithcode.com/sota/atari-games-on-atari-2600-gravitar?p=unifying-count-based-exploration-and)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/unifying-count-based-exploration-and/atari-games-on-atari-2600-venture)](https://paperswithcode.com/sota/atari-games-on-atari-2600-venture?p=unifying-count-based-exploration-and)`

Unifying Count-Based Exploration and Intrinsic Motivation

NeurIPS 2016 · Marc G. Bellemare, Sriram Srinivasan, Georg Ostrovski, Tom Schaul, David Saxton, Remi Munos

We consider an agent's uncertainty about its environment and the problem of generalizing this uncertainty across observations. Specifically, we focus on the problem of exploration in non-tabular reinforcement learning. Drawing inspiration from the intrinsic motivation literature, we use density models to measure uncertainty, and propose a novel algorithm for deriving a pseudo-count from an arbitrary density model. This technique enables us to generalize count-based exploration algorithms to the non-tabular case. We apply our ideas to Atari 2600 games, providing sensible pseudo-counts from raw pixels. We transform these pseudo-counts into intrinsic rewards and obtain significantly improved exploration in a number of hard games, including the infamously difficult Montezuma's Revenge.
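The abstract's two steps (derive a pseudo-count from a density model, then turn it into an intrinsic reward) can be illustrated with a minimal sketch. Nothing below comes from this page or from the listed repository. The pseudo-count expression, N̂(x) = ρ(x)(1 − ρ′(x)) / (ρ′(x) − ρ(x)), is the form usually attributed to the paper, where ρ is the model's probability of x before observing it and ρ′ the probability after one more update on x. The `EmpiricalDensity` class, the function names, and the constants `beta` and `eps` are hypothetical choices for illustration. The toy model is tabular, whereas the paper works from raw pixels with a learned density model.

```python
import math
from collections import Counter


class EmpiricalDensity:
    """Toy density model (an assumption): empirical frequencies of hashable observations."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def prob(self, x):
        # Probability assigned to x by the current model; 0.0 before any data.
        return self.counts[x] / self.total if self.total else 0.0

    def update(self, x):
        # Train the model on one more occurrence of x.
        self.counts[x] += 1
        self.total += 1


def pseudo_count(model, x):
    """Observe x once and return its pseudo-count from the change in density.

    Side effect: the model is updated with x.
    """
    rho = model.prob(x)        # probability before observing x
    model.update(x)
    rho_prime = model.prob(x)  # probability after observing x once more
    gain = rho_prime - rho
    if gain <= 0.0:
        # The model did not become more confident about x (degenerate case,
        # e.g. x is the only observation ever seen); treat x as fully known.
        return math.inf
    return rho * (1.0 - rho_prime) / gain


def intrinsic_reward(n_hat, beta=0.05, eps=0.01):
    """Exploration bonus that shrinks as the pseudo-count grows (illustrative constants)."""
    return beta / math.sqrt(n_hat + eps)


model = EmpiricalDensity()
for obs in ["a", "b", "a", "c"]:
    model.update(obs)

n_a = pseudo_count(model, "a")  # close to 2.0, matching the two earlier visits to "a"
n_z = pseudo_count(model, "z")  # 0.0, since "z" has never been seen
print(n_a, intrinsic_reward(n_a))
print(n_z, intrinsic_reward(n_z))
```

With an empirical-frequency model the pseudo-count reduces to the ordinary visit count, which makes the sketch easy to check by hand. A learned density model would generalize across similar observations instead.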

Code

RLAgent/state-marginal-matching

Tasks

Atari Games

Montezuma's Revenge

reinforcement-learning

Reinforcement Learning (RL)

Datasets

Arcade Learning Environment

Results from the Paper

Ranked #10 on Atari Games on Atari 2600 Montezuma's Revenge
