no code implementations • 18 Jun 2023 • Veronica Chelu, Tom Zahavy, Arthur Guez, Doina Precup, Sebastian Flennerhag
We work towards a unifying paradigm for accelerating policy optimization methods in reinforcement learning (RL) by integrating foresight in the policy improvement step via optimistic and adaptive updates.
no code implementations • 20 Feb 2022 • Veronica Chelu, Diana Borsa, Doina Precup, Hado van Hasselt
Efficient credit assignment is essential for reinforcement learning algorithms in both prediction and control settings.
no code implementations • 5 Jan 2022 • Anthony GX-Chen, Veronica Chelu, Blake A. Richards, Joelle Pineau
We illustrate that incorporating predictive knowledge through an $\eta\gamma$-discounted SF model makes more efficient use of sampled experience, compared to either extreme, i. e. bootstrapping entirely on the value function estimate, or bootstrapping on the product of separately estimated successor features and instantaneous reward models.
no code implementations • 12 Jul 2021 • Ray Jiang, Shangtong Zhang, Veronica Chelu, Adam White, Hado van Hasselt
We develop a multi-step emphatic weighting that can be combined with replay, and a time-reversed $n$-step TD learning algorithm to learn the required emphatic weighting.
no code implementations • NeurIPS 2020 • Veronica Chelu, Doina Precup, Hado van Hasselt
We address the problem of credit assignment in reinforcement learning and explore fundamental questions regarding the way in which an agent can best use additional computation to propagate new information, by planning with internal models of the world to improve its predictions.