Montezuma's Revenge

28 papers with code • 1 benchmark • 1 dataset

Montezuma's Revenge is an Atari 2600 benchmark game that is notoriously difficult for reinforcement learning algorithms because its rewards are sparse. Solutions typically employ algorithms that incentivise environment exploration in different ways.
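
The common template behind these solutions is to add an intrinsic exploration bonus to the sparse game reward before the agent sees it. Below is a minimal sketch of that pattern, assuming the Gymnasium API; the wrapper name, `bonus_fn`, and `beta` are hypothetical placeholders, not any particular paper's method.

```python
# Hypothetical sketch: augment the sparse extrinsic game reward with
# an intrinsic exploration bonus computed from the observation.
import gymnasium as gym


class ExplorationBonusWrapper(gym.Wrapper):
    """Adds beta * bonus_fn(obs) to the environment reward."""

    def __init__(self, env, bonus_fn, beta=0.01):
        super().__init__(env)
        self.bonus_fn = bonus_fn  # any novelty/curiosity estimator
        self.beta = beta          # trades off exploration vs. game score

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        reward += self.beta * self.bonus_fn(obs)
        return obs, reward, terminated, truncated, info
```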

For the state-of-the-art tables, please consult the parent Atari Games task.

(Image credit: Q-map)

Most implemented papers

Playing Atari Games with Deep Reinforcement Learning and Human Checkpoint Replay

smj007/Breakout_A3C 18 Jul 2016

This paper introduces a novel method for learning how to play the most difficult Atari 2600 games from the Arcade Learning Environment using deep reinforcement learning.
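
The checkpoint-replay idea named in the title is to restart training episodes from emulator states saved during human play, so the agent regularly experiences late-game rooms it would almost never reach from the first screen. A hedged sketch, assuming the ALE's state interface (`cloneState`, `restoreState`, and `getScreenRGB` are real ALE calls; everything else here is hypothetical):

```python
# Hypothetical sketch of human checkpoint replay: reset episodes to
# emulator states recorded along a human play-through.
import random


def reset_from_human_checkpoint(env, human_checkpoints):
    """Restore a randomly chosen saved human state as the episode start."""
    env.reset()
    state = random.choice(human_checkpoints)  # saved via ale.cloneState()
    env.unwrapped.ale.restoreState(state)     # jump to the human state
    return env.unwrapped.ale.getScreenRGB()   # current frame as observation
```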

Count-Based Exploration with Neural Density Models

nolisten/erl ICML 2017

This pseudo-count was used to generate an exploration bonus for a DQN agent and, combined with a mixed Monte Carlo update, was sufficient to achieve state of the art on the Atari 2600 game Montezuma's Revenge.
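
For context, the pseudo-count construction this builds on derives a visit count from how much a density model's probability of a state rises after training on it once more. A minimal sketch of the bonus computation; `beta` and `eps` are assumed hyperparameters:

```python
# Sketch of a pseudo-count exploration bonus: rho is the density
# model's probability of the state before observing it, rho_prime the
# probability after one more training step on that same state.
import math


def pseudo_count_bonus(rho, rho_prime, beta=0.05, eps=1e-8):
    """Exploration bonus proportional to 1 / sqrt(pseudo-count)."""
    # The prediction gain (rho_prime - rho) must be positive.
    pseudo_count = rho * (1.0 - rho_prime) / max(rho_prime - rho, eps)
    return beta / math.sqrt(pseudo_count + eps)
```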

Beating Atari with Natural Language Guided Reinforcement Learning

deniztekalp/comp-491-bitirme 18 Apr 2017

We introduce the first deep reinforcement learning agent that learns to beat Atari games with the aid of natural language instructions.

Feature Control as Intrinsic Motivation for Hierarchical Reinforcement Learning

lyebi/Test 18 May 2017

We highlight the advantage of our approach in one of the hardest games -- Montezuma's Revenge -- for which the ability to handle sparse rewards is key.

Playing hard exploration games by watching YouTube

MaxSobolMark/HardRLWithYoutube NeurIPS 2018

One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator.
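
In this paper the imitation takes the form of a checkpoint reward: the agent earns a small one-off bonus whenever its embedded observation comes close to the next checkpoint along the demonstration. A hedged sketch; the embedding, distance threshold, and bonus value are assumptions:

```python
# Hypothetical checkpoint-matching imitation reward: a one-off bonus
# each time the agent's embedded observation reaches the next
# checkpoint taken from a demonstration trajectory.
import numpy as np


def imitation_reward(obs_embedding, checkpoints, next_idx, threshold=0.5):
    """Return (bonus, updated checkpoint index)."""
    if next_idx < len(checkpoints):
        dist = np.linalg.norm(obs_embedding - checkpoints[next_idx])
        if dist < threshold:
            return 0.5, next_idx + 1  # small fixed bonus (value assumed)
    return 0.0, next_idx
```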

Empowerment-driven Exploration using Mutual Information Estimation

navneet-nmk/pytorch-rl 11 Oct 2018

However, many state-of-the-art deep reinforcement learning algorithms that rely on epsilon-greedy exploration fail on these environments.
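
For reference, the epsilon-greedy rule being criticised is pure undirected dithering, as in the standard sketch below; random action noise of this kind rarely chains together the long, precise action sequences Montezuma's Revenge demands:

```python
# Standard epsilon-greedy action selection: with probability epsilon
# act randomly, otherwise pick the greedy action under the Q-values.
import random

import numpy as np


def epsilon_greedy(q_values, epsilon=0.1):
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return int(np.argmax(q_values))
```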

Using Natural Language for Reward Shaping in Reinforcement Learning

prasoongoyal/rl-learn 5 Mar 2019

A common approach to reduce interaction time with the environment is to use reward shaping, which involves carefully designing reward functions that provide the agent with intermediate rewards for progress towards the goal.
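
One classic, principled instance of this idea is potential-based shaping (Ng et al., 1999), which adds intermediate rewards without changing the optimal policy; this paper derives the shaping signal from natural language instead, but the mechanics are the same. A minimal sketch, where `potential` is an assumed task-specific progress estimate:

```python
# Potential-based reward shaping: r' = r + gamma * phi(s') - phi(s).
# Because the added terms telescope along any trajectory, the optimal
# policy of the original task is preserved.
def shaped_reward(reward, potential, state, next_state, gamma=0.99):
    return reward + gamma * potential(next_state) - potential(state)
```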

Uncertainty-sensitive learning and planning with ensembles

learningandplanningICLR/learningandplanning 25 Sep 2019

Notably, our method performs well in environments with sparse rewards where standard TD(1) backups fail.
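
A TD(1) backup bootstraps from nothing: its target is the full Monte Carlo return, so under sparse rewards almost every target is zero and the agent gets no learning signal. A minimal sketch of the target computation for one recorded episode:

```python
# TD(1) targets are the discounted Monte Carlo returns G_t, computed
# here by a single backward pass over one episode's rewards.
def td1_targets(rewards, gamma=0.99):
    targets, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        targets.append(g)
    return list(reversed(targets))
```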

DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning

grockious/deepsynth 22 Nov 2019

This paper proposes DeepSynth, a method for effective training of deep Reinforcement Learning (RL) agents when the reward is sparse and non-Markovian and progress towards it requires achieving an unknown sequence of high-level objectives.
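
One way to see what the synthesized automaton buys: tracking an automaton state alongside the observation turns the non-Markovian reward into a Markovian one over the product space. A toy, hand-written two-objective automaton for illustration; DeepSynth learns such automata from exploration traces rather than taking them as given:

```python
# Toy automaton over high-level events (hand-written for illustration;
# the method synthesizes this structure automatically from traces).
AUTOMATON = {
    (0, "got_key"): 1,      # objective 1: pick up the key
    (1, "opened_door"): 2,  # objective 2: open the door (accepting)
}


def step_automaton(q, event):
    """Advance the automaton; irrelevant events leave it in place."""
    return AUTOMATON.get((q, event), q)


def product_state(obs, q):
    # Reward that depended on history can now depend on q alone.
    return (obs, q)
```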