Montezuma's Revenge
28 papers with code • 1 benchmark • 1 dataset
Montezuma's Revenge is an Atari 2600 benchmark game that is known to be difficult for reinforcement learning algorithms, owing to its sparse rewards and long horizon. Solutions typically employ algorithms that incentivise environment exploration in different ways.
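One classic way to incentivise exploration is a count-based bonus: the agent receives extra reward inversely proportional to how often a state has been visited. A minimal sketch (class and parameter names are illustrative, not from any specific paper on this page):

```python
from collections import defaultdict
import math

class CountBonus:
    """Exploration bonus of beta / sqrt(N(s)) on the N-th visit to state s."""

    def __init__(self, beta=1.0):
        self.beta = beta
        self.counts = defaultdict(int)  # visit counts per (hashable) state

    def bonus(self, state):
        """Record a visit and return the intrinsic reward for it."""
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])
```

In training, the agent would optimise the sum of the environment reward and this bonus, so rarely visited states stay attractive while familiar ones fade. Raw Atari frames are not hashable states, which is why practical methods replace the count with a learned novelty estimate.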
For the state-of-the-art tables, please consult the parent Atari Games task.
(Image credit: Q-map)
Latest papers with no code
Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem
Fine-tuning is a widespread technique that allows practitioners to transfer pre-trained capabilities, as recently showcased by the successful applications of foundation models.
Int-HRL: Towards Intention-based Hierarchical Reinforcement Learning
We show that intentions of human players, i.e., the precursors of goal-oriented decisions, can be robustly predicted from eye gaze even for the long-horizon, sparse-reward task of Montezuma's Revenge - one of the most challenging RL tasks in the Atari 2600 game suite.
Sample Efficient Deep Reinforcement Learning via Local Planning
One useful property of simulators is that it is typically easy to reset the environment to a previously observed state.
Curiosity in Hindsight: Intrinsic Exploration in Stochastic Environments
In this work, we study a natural solution derived from structural causal models of the world: Our key idea is to learn representations of the future that capture precisely the unpredictable aspects of each outcome -- which we use as additional input for predictions, such that intrinsic rewards only reflect the predictable aspects of world dynamics.
Paused Agent Replay Refresh
Paused Agent Replay Refresh (PARR) is a drop-in replacement for target networks that supports more complex learning algorithms without the need for approximation.
GAN-based Intrinsic Exploration For Sample Efficient Reinforcement Learning
In this study, we address the problem of efficient exploration in reinforcement learning.
Parametrically Retargetable Decision-Makers Tend To Seek Power
We show that a range of qualitatively dissimilar decision-making procedures incentivize agents to seek power.
Understanding and Preventing Capacity Loss in Reinforcement Learning
The reinforcement learning (RL) problem is rife with sources of non-stationarity, making it a notoriously difficult problem domain for the application of neural networks.
Generative Adversarial Exploration for Reinforcement Learning
Exploration is crucial for training the optimal reinforcement learning (RL) policy, where the key is to discriminate whether a visited state is novel.
Exploration by Random Network Distillation
In particular we establish state of the art performance on Montezuma's Revenge, a game famously difficult for deep reinforcement learning methods.
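The core of Random Network Distillation is to use the prediction error against a fixed, randomly initialised target network as the intrinsic reward: states the predictor has seen often become easy to predict and earn little bonus. A minimal linear-network sketch (dimensions, learning rate, and function names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, EMB_DIM = 8, 4

# Fixed, randomly initialised target network (never trained).
W_target = rng.normal(size=(EMB_DIM, OBS_DIM))
# Predictor network, trained to match the target's embeddings.
W_pred = np.zeros((EMB_DIM, OBS_DIM))

def intrinsic_reward(obs):
    """Squared prediction error of the predictor vs. the fixed target."""
    err = W_pred @ obs - W_target @ obs
    return float(err @ err)

def train_predictor(obs, lr=0.01):
    """One SGD step on 0.5 * ||W_pred @ obs - W_target @ obs||^2."""
    global W_pred
    err = W_pred @ obs - W_target @ obs
    W_pred -= lr * np.outer(err, obs)  # exact gradient for the linear case

# A repeatedly visited state becomes "familiar": its bonus shrinks.
obs = rng.normal(size=OBS_DIM)
before = intrinsic_reward(obs)
for _ in range(200):
    train_predictor(obs)
after = intrinsic_reward(obs)
```

The paper's actual networks are convolutional and trained on Atari frames; the linear maps here only illustrate why the error, and hence the exploration bonus, decays on frequently visited states while staying high on novel ones.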