no code implementations • 3 Jul 2020 • Adam Bignold, Francisco Cruz, Matthew E. Taylor, Tim Brys, Richard Dazeley, Peter Vamplew, Cameron Foale
In this work, while reviewing externally influenced methods, we propose a conceptual framework and taxonomy for assisted reinforcement learning, aimed at fostering collaboration by classifying and comparing various methods that use external information in the learning process.
no code implementations • 13 Aug 2018 • Hélène Plisnier, Denis Steckelmacher, Tim Brys, Diederik M. Roijers, Ann Nowé
Our technique, Directed Policy Gradient (DPG), allows a teacher or backup policy to override the agent before it acts undesirably, while allowing the agent to leverage human advice or directives to learn faster.
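The override idea in the abstract can be illustrated with a minimal sketch. Note this is a generic gating rule, not the paper's actual DPG formulation; the environment, the "undesirable" test, and all names below are illustrative assumptions.

```python
# Minimal sketch: a teacher/backup policy overrides the agent before an
# undesirable action is executed. This is an illustrative gating rule,
# not the exact Directed Policy Gradient (DPG) update from the paper.
def act_with_override(agent_action, state, teacher, is_undesirable):
    """Return the action actually executed in the environment."""
    if is_undesirable(state, agent_action):
        # The teacher intervenes before the bad action is taken; the
        # agent can still learn from the corrected transition.
        return teacher(state)
    return agent_action

# Toy usage (assumed setup): states are positions on a line; any action
# that would leave the non-negative region counts as undesirable.
teacher = lambda s: +1                 # backup policy: always step right
unsafe = lambda s, a: s + a < 0        # would exit the safe region
executed = act_with_override(-1, 0, teacher, unsafe)  # teacher overrides
```

In DPG proper, such advice shapes the policy gradient itself rather than merely gating actions, which is what lets the agent learn faster from directives.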
no code implementations • 2 May 2015 • William Curran, Tim Brys, Matthew Taylor, William Smart
When using dimensionality reduction in Mario, learning converges much faster to a good policy.
no code implementations • 11 Feb 2015 • Anna Harutyunyan, Tim Brys, Peter Vrancx, Ann Nowé
While PBRS is proven to always preserve optimal policies, its effect on learning speed is determined by the quality of its potential function, which, in turn, depends on both the underlying heuristic and the scale.
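The policy-invariance property mentioned above comes from shaping with a potential difference, F(s, s') = γΦ(s') − Φ(s). Below is a minimal sketch of PBRS inside tabular Q-learning; the chain environment, the heuristic potential, and all hyperparameters are illustrative assumptions, not from the paper.

```python
import numpy as np

# Minimal sketch of potential-based reward shaping (PBRS) in tabular
# Q-learning on an assumed 10-state chain with the goal at the far end.
def shaped_q_learning(n_states=10, n_actions=2, gamma=0.95, alpha=0.1,
                      episodes=200, seed=0):
    rng = np.random.default_rng(seed)
    # Heuristic potential: states closer to the goal get higher potential.
    # Its quality and scale determine the speed-up, as the abstract notes.
    phi = np.arange(n_states) / (n_states - 1)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy action selection (action 1 = right, 0 = left)
            a = rng.integers(n_actions) if rng.random() < 0.1 else int(Q[s].argmax())
            s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
            r = 1.0 if s2 == n_states - 1 else 0.0
            # PBRS shaping term F(s, s') = gamma * phi(s') - phi(s);
            # this form provably preserves the optimal policy.
            f = gamma * phi[s2] - phi[s]
            Q[s, a] += alpha * (r + f + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q
```

Because the shaping rewards telescope along any trajectory, they add a state-dependent constant to returns and cannot change which policy is optimal; only the learning speed depends on how well Φ reflects the true values.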
no code implementations • 21 May 2014 • Anna Harutyunyan, Tim Brys, Peter Vrancx, Ann Nowé
Recent advances in gradient temporal-difference methods allow multiple value functions to be learned off-policy in parallel, without sacrificing convergence guarantees or computational efficiency.
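A minimal sketch of what "multiple value functions in parallel, off-policy" can look like with gradient-TD (TDC/GTD-style) updates: one weight vector plus one correction vector per value function, all updated from the same behaviour stream. The feature map, importance ratios, and step sizes are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Sketch: learn several value functions in parallel from one off-policy
# data stream using TDC/GTD-style updates. Each task k has value weights
# theta[k] and correction weights w[k]; rhos[k](s, a) is that task's
# importance-sampling ratio (an assumed interface for this sketch).
def gtd_parallel(trajectory, phi, rhos, gamma=0.9, alpha=0.05, beta=0.05):
    n_features = phi(trajectory[0][0]).shape[0]
    n_tasks = len(rhos)
    theta = np.zeros((n_tasks, n_features))   # value weights per task
    w = np.zeros((n_tasks, n_features))       # correction weights per task
    for (s, a, r, s2) in trajectory:
        x, x2 = phi(s), phi(s2)
        for k, rho in enumerate(rhos):
            imp = rho(s, a)                   # importance-sampling ratio
            delta = r + gamma * theta[k] @ x2 - theta[k] @ x
            # TDC-style corrected update for theta, plus the secondary
            # estimator w tracking the expected TD error per feature.
            theta[k] += alpha * imp * (delta * x - gamma * (w[k] @ x) * x2)
            w[k] += beta * imp * (delta - w[k] @ x) * x
    return theta
```

Because every task shares the same samples and features, the per-step cost stays linear in the number of tasks, which is what makes learning many value functions in parallel cheap.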