Rainbow: Combining Improvements in Deep Reinforcement Learning

The deep reinforcement learning community has made several independent improvements to the DQN algorithm. However, it is unclear which of these extensions are complementary and can be fruitfully combined. This paper examines six extensions to the DQN algorithm and empirically studies their combination. Our experiments show that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance. We also provide results from a detailed ablation study that shows the contribution of each component to overall performance.


Results from the Paper


Task                | Dataset                        | Model       | Metric Name                     | Metric Value | Global Rank
Montezuma's Revenge | Atari 2600 Montezuma's Revenge | Rainbow     | Average Return (NoOp)           | 384          | # 3
Atari Games         | Atari 2600 Ms. Pacman          | Rainbow     | Score                           | 2,570.2      | # 47
Atari Games         | Atari 2600 Space Invaders      | Rainbow     | Score                           | 12,629.0     | # 54
Atari Games         | Atari-57                       | Rainbow DQN | Human World Record Breakthrough | 4            | # 8
Atari Games         | Atari-57                       | Rainbow DQN | Mean Human Normalized Score     | 873.97%      | # 9
Atari Games         | atari game                     | Rainbow     | Human World Record Breakthrough | 4            | # 9
Atari Games         | Atari games                    | Rainbow DQN | Mean Human Normalized Score     | 873.97%      | # 10
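
The "Mean Human Normalized Score" metric listed above is not defined on this page. A minimal sketch of the convention commonly used for Atari benchmarks follows, assuming each game's score is rescaled so that a random policy maps to 0% and a human baseline maps to 100%, then averaged across games. The function names and the per-game numbers are hypothetical, chosen only to illustrate the arithmetic; they are not values from the paper.

```python
from __future__ import annotations


def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Rescale a raw game score so random play is 0% and the human baseline is 100%."""
    if human == random:
        raise ValueError("human and random baselines must differ")
    return 100.0 * (agent - random) / (human - random)


def mean_human_normalized(scores: dict[str, tuple[float, float, float]]) -> float:
    """Average the per-game normalized scores; each value is (agent, random, human)."""
    per_game = [human_normalized_score(a, r, h) for a, r, h in scores.values()]
    return sum(per_game) / len(per_game)


# Hypothetical raw scores, for illustration only (not taken from the paper).
example = {
    "game_a": (150.0, 50.0, 100.0),  # agent doubles the human margin over random
    "game_b": (30.0, 10.0, 50.0),    # agent reaches half the human margin
}

if __name__ == "__main__":
    print(mean_human_normalized(example))
```

Because the mean is taken over per-game percentages, a few games with very large normalized scores can dominate it, which is why a median is often reported alongside the mean.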

Methods