1 code implementation • 5 Dec 2023 • Marc Lanctot, Kate Larson, Yoram Bachrach, Luke Marris, Zun Li, Avishkar Bhoopchand, Thomas Anthony, Brian Tanner, Anna Koop
We argue that many general evaluation problems can be viewed through the lens of voting theory.
no code implementations • 7 Feb 2022 • Richard S. Sutton, Marlos C. Machado, G. Zacharias Holland, David Szepesvari, Finbarr Timbers, Brian Tanner, Adam White
Each subtask is solved to produce an option, and then a model of the option is learned and made available to the planning process.
Model-based Reinforcement Learning, Reinforcement Learning
no code implementations • NeurIPS 2004 • Richard S. Sutton, Brian Tanner
We introduce a generalization of temporal-difference (TD) learning to networks of interrelated predictions.