Search Results for author: Oleg Klimov

Found 6 papers, 5 papers with code

Phasic Policy Gradient

3 code implementations 9 Sep 2020 Karl Cobbe, Jacob Hilton, Oleg Klimov, John Schulman

We introduce Phasic Policy Gradient (PPG), a reinforcement learning framework which modifies traditional on-policy actor-critic methods by separating policy and value function training into distinct phases.

Ranked #1 on Reinforcement Learning (RL) on ProcGen (using extra training data)

Reinforcement Learning (RL)

Exploration by Random Network Distillation

21 code implementations ICLR 2019 Yuri Burda, Harrison Edwards, Amos Storkey, Oleg Klimov

In particular, we establish state-of-the-art performance on Montezuma's Revenge, a game famously difficult for deep reinforcement learning methods.

Montezuma's Revenge reinforcement-learning +2

Gotta Learn Fast: A New Benchmark for Generalization in RL

3 code implementations 10 Apr 2018 Alex Nichol, Vicki Pfau, Christopher Hesse, Oleg Klimov, John Schulman

In this report, we present a new reinforcement learning (RL) benchmark based on the Sonic the Hedgehog (TM) video game franchise.

Few-Shot Learning reinforcement-learning +2

Proximal Policy Optimization Algorithms

171 code implementations 20 Jul 2017 John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent.

Continuous Control Dota 2 +3
