Search Results for author: Gregory Farquhar

Found 22 papers, 17 papers with code

An Investigation of the Bias-Variance Tradeoff in Meta-Gradients

1 code implementation · 22 Sep 2022 · Risto Vuorio, Jacob Beck, Shimon Whiteson, Jakob Foerster, Gregory Farquhar

Meta-gradients provide a general approach for optimizing the meta-parameters of reinforcement learning (RL) algorithms.

Meta-Learning · Reinforcement Learning (RL)
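As a rough illustration of the meta-gradient idea described above (the toy losses, parameters, and step sizes below are hypothetical, not the paper's setup): the outer objective is differentiated through one inner RL-style update with respect to a meta-parameter such as the discount factor.

    import torch

    # Hypothetical sketch of a single meta-gradient step: the discount gamma is the
    # meta-parameter, and the outer loss is differentiated through one inner update.
    theta = torch.tensor([0.0, 0.5], requires_grad=True)   # toy agent parameters
    gamma = torch.tensor(0.9, requires_grad=True)          # meta-parameter

    def inner_loss(params, gamma):
        # Toy TD-style objective standing in for the RL loss used by the inner update.
        reward, value, next_value = 1.0, params[0], params[1]
        return (reward + gamma * next_value - value) ** 2

    def outer_loss(params):
        # Toy evaluation objective used to judge the updated parameters.
        return (params[0] - 1.0) ** 2

    # One differentiable inner update (create_graph=True keeps the dependence on gamma).
    g = torch.autograd.grad(inner_loss(theta, gamma), theta, create_graph=True)[0]
    theta_updated = theta - 0.1 * g

    # Meta-gradient: derivative of the outer loss w.r.t. gamma through the inner update.
    meta_grad = torch.autograd.grad(outer_loss(theta_updated), gamma)[0]
    print(meta_grad)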

Model-Value Inconsistency as a Signal for Epistemic Uncertainty

no code implementations · 8 Dec 2021 · Angelos Filos, Eszter Vértes, Zita Marinho, Gregory Farquhar, Diana Borsa, Abram Friesen, Feryal Behbahani, Tom Schaul, André Barreto, Simon Osindero

Unlike prior work which estimates uncertainty by training an ensemble of many models and/or value functions, this approach requires only the single model and value function which are already being learned in most model-based reinforcement learning algorithms.

Model-based Reinforcement Learning · Rolling Shutter Correction
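A minimal sketch of the idea, assuming toy stand-ins for the learned model and value function (this is not the paper's exact estimator): rolling the single model forward for different horizons and bootstrapping with the value function yields several estimates of the same state's value, and their disagreement serves as an epistemic-uncertainty signal.

    import numpy as np

    def model_step(state):
        # Hypothetical deterministic learned model: returns (reward, next_state).
        return 0.1 * state.sum(), 0.95 * state

    def value(state):
        # Hypothetical learned value function.
        return float(state @ np.array([0.5, -0.2]))

    def k_step_value(state, k, gamma=0.99):
        # k model steps, then bootstrap with the learned value function.
        ret, s = 0.0, state
        for t in range(k):
            r, s = model_step(s)
            ret += (gamma ** t) * r
        return ret + (gamma ** k) * value(s)

    state = np.array([1.0, 2.0])
    estimates = [k_step_value(state, k) for k in range(4)]   # k = 0 is just value(state)
    uncertainty = np.std(estimates)                          # disagreement across horizons
    print(estimates, uncertainty)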

Self-Consistent Models and Values

no code implementations · NeurIPS 2021 · Gregory Farquhar, Kate Baumli, Zita Marinho, Angelos Filos, Matteo Hessel, Hado van Hasselt, David Silver

Learned models of the environment provide reinforcement learning (RL) agents with flexible ways of making predictions about the environment.

reinforcement-learning · Reinforcement Learning (RL)

Proper Value Equivalence

1 code implementation · NeurIPS 2021 · Christopher Grimm, André Barreto, Gregory Farquhar, David Silver, Satinder Singh

The value-equivalence (VE) principle proposes a simple answer to this question: a model should capture the aspects of the environment that are relevant for value-based planning.

Model-based Reinforcement Learning · Reinforcement Learning (RL)
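An illustrative value-equivalence-style loss, written under assumed toy interfaces (the model signature, shapes, and value functions below are placeholders, not the paper's objective): the learned model is asked only to match the environment's Bellman backups for a chosen set of value functions, rather than to reconstruct observations.

    import torch

    def bellman_backup(reward, next_state, value_fn, gamma=0.99):
        return reward + gamma * value_fn(next_state)

    def ve_loss(model, batch, value_fns, gamma=0.99):
        # batch: (state, action, env_reward, env_next_state) from real transitions.
        s, a, r_env, s_next_env = batch
        r_model, s_next_model = model(s, a)          # hypothetical learned model
        loss = 0.0
        for v in value_fns:                          # the set of value functions defining VE
            target = bellman_backup(r_env, s_next_env, v, gamma).detach()
            pred = bellman_backup(r_model, s_next_model, v, gamma)
            loss = loss + torch.mean((pred - target) ** 2)
        return loss

    model = lambda s, a: (s.sum(-1, keepdim=True), 0.9 * s)        # toy model
    value_fns = [lambda x: x.sum(-1, keepdim=True),
                 lambda x: (x ** 2).sum(-1, keepdim=True)]
    batch = (torch.randn(8, 4), None, torch.randn(8, 1), torch.randn(8, 4))
    print(ve_loss(model, batch, value_fns))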

Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

4 code implementations · NeurIPS 2020 · Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson

We show in particular that this projection can fail to recover the optimal policy even with access to $Q^*$, which primarily stems from the equal weighting placed on each joint action.

Multi-agent Reinforcement Learning · Q-Learning +3
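A hedged sketch of the weighting idea (a simplification loosely in the spirit of the paper's weighted operators; the rule and alpha below are illustrative): instead of weighting every joint action equally when fitting the monotonic value, joint actions whose target falls below the current estimate receive only a small weight.

    import torch

    def weighted_qmix_loss(q_tot, td_target, alpha=0.1):
        # q_tot:     monotonic mixing-network output for the taken joint actions, shape [B]
        # td_target: unrestricted target values for those joint actions, shape [B]
        error = td_target.detach() - q_tot
        # Full weight where the monotonic estimate underestimates the target, alpha elsewhere.
        weight = torch.where(error > 0, torch.ones_like(error), torch.full_like(error, alpha))
        return torch.mean(weight * error ** 2)

    q_tot = torch.tensor([1.0, 2.0, 0.5], requires_grad=True)
    td_target = torch.tensor([1.5, 1.0, 0.7])
    print(weighted_qmix_loss(q_tot, td_target))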

Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

1 code implementation · 19 Mar 2020 · Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson

At the same time, it is often possible to train the agents in a centralised fashion where global state information is available and communication constraints are lifted.

reinforcement-learning · Reinforcement Learning (RL) +2

Loaded DiCE: Trading off Bias and Variance in Any-Order Score Function Estimators for Reinforcement Learning

1 code implementation · 23 Sep 2019 · Gregory Farquhar, Shimon Whiteson, Jakob Foerster

Gradient-based methods for optimisation of objectives in stochastic settings with unknown or intractable dynamics require estimators of derivatives.

Continuous Control · Meta Reinforcement Learning +2
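For background, a minimal score-function (REINFORCE-style) derivative estimator of the kind such papers build on; this sketch is not Loaded DiCE itself, and the toy Gaussian objective is chosen only because its true gradient (2·theta) is known.

    import torch

    theta = torch.tensor(0.5, requires_grad=True)

    # Objective: E_{x ~ N(theta, 1)}[x^2]; sampling blocks the pathwise gradient,
    # so the derivative is estimated via the score function log_prob(x).
    dist = torch.distributions.Normal(theta, 1.0)
    x = dist.sample((10000,))
    surrogate = (dist.log_prob(x) * (x ** 2).detach()).mean()
    surrogate.backward()
    print(theta.grad)   # ≈ 2 * theta = 1.0, but with notable variance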

Growing Action Spaces

1 code implementation · ICML 2020 · Gregory Farquhar, Laura Gustafson, Zeming Lin, Shimon Whiteson, Nicolas Usunier, Gabriel Synnaeve

In complex tasks, such as those with large combinatorial action spaces, random exploration may be too inefficient to achieve meaningful learning progress.

reinforcement-learning · Reinforcement Learning (RL) +1

A Survey of Reinforcement Learning Informed by Natural Language

no code implementations · 10 Jun 2019 · Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel

To be successful in real-world tasks, Reinforcement Learning (RL) needs to exploit the compositional, relational, and hierarchical structure of the world, and learn to transfer it to the task at hand.

Decision Making · Instruction Following +5

DiCE: The Infinitely Differentiable Monte Carlo Estimator

1 code implementation · ICML 2018 · Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric Xing, Shimon Whiteson

Lastly, to match the first-order gradient under differentiation, the surrogate-loss (SL) approach treats part of the cost as a fixed sample, which we show leads to missing and wrong terms for estimators of higher-order derivatives.

Meta-Learning
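A small sketch of the MagicBox operator the paper introduces, applied to a toy one-step problem (the Bernoulli bandit below is illustrative): MagicBox(tau) = exp(tau - stop_gradient(tau)) evaluates to 1 in the forward pass but reproduces the score-function terms under repeated differentiation, so the surrogate objective keeps correct higher-order derivatives.

    import torch

    def magic_box(tau):
        # Evaluates to 1, but differentiates to magic_box(tau) * d(tau).
        return torch.exp(tau - tau.detach())

    theta = torch.tensor(0.5, requires_grad=True)
    dist = torch.distributions.Bernoulli(probs=torch.sigmoid(theta))
    actions = dist.sample((1000,))
    log_probs = dist.log_prob(actions)
    rewards = actions                                  # toy reward: 1 if action == 1

    # DiCE-style surrogate: each reward is multiplied by the magic box of the log-probs
    # of the stochastic nodes that influence it (here, just its own action).
    dice_objective = (magic_box(log_probs) * rewards).mean()
    grad = torch.autograd.grad(dice_objective, theta, create_graph=True)[0]
    print(dice_objective.item(), grad.item())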

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

16 code implementations · ICML 2018 · Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson

At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted.

Multi-agent Reinforcement Learning · reinforcement-learning +4
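A simplified sketch of the monotonic mixing idea (layer sizes and module names are placeholders): per-agent utilities are combined by a mixing network whose state-conditioned weights are constrained to be non-negative, so the joint value is monotonic in each agent's value and the greedy joint action factorises across agents.

    import torch
    import torch.nn as nn

    class MonotonicMixer(nn.Module):
        def __init__(self, n_agents, state_dim, embed_dim=32):
            super().__init__()
            self.n_agents, self.embed_dim = n_agents, embed_dim
            # Hypernetworks produce the mixing weights from the global state.
            self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
            self.hyper_w2 = nn.Linear(state_dim, embed_dim)
            self.hyper_b1 = nn.Linear(state_dim, embed_dim)
            self.hyper_b2 = nn.Sequential(nn.Linear(state_dim, embed_dim), nn.ReLU(),
                                          nn.Linear(embed_dim, 1))

        def forward(self, agent_qs, state):
            # agent_qs: [batch, n_agents], state: [batch, state_dim]
            w1 = torch.abs(self.hyper_w1(state)).view(-1, self.n_agents, self.embed_dim)
            b1 = self.hyper_b1(state).unsqueeze(1)
            hidden = torch.relu(torch.bmm(agent_qs.unsqueeze(1), w1) + b1)
            w2 = torch.abs(self.hyper_w2(state)).view(-1, self.embed_dim, 1)
            b2 = self.hyper_b2(state).unsqueeze(1)
            return (torch.bmm(hidden, w2) + b2).squeeze(-1).squeeze(-1)   # Q_tot: [batch]

    mixer = MonotonicMixer(n_agents=3, state_dim=10)
    print(mixer(torch.randn(4, 3), torch.randn(4, 10)).shape)   # torch.Size([4])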

DiCE: The Infinitely Differentiable Monte-Carlo Estimator

5 code implementations · 14 Feb 2018 · Jakob Foerster, Gregory Farquhar, Maruan Al-Shedivat, Tim Rocktäschel, Eric P. Xing, Shimon Whiteson

Lastly, to match the first-order gradient under differentiation, the surrogate-loss (SL) approach treats part of the cost as a fixed sample, which we show leads to missing and wrong terms for estimators of higher-order derivatives.

Meta-Learning

TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning

1 code implementation · ICLR 2018 · Gregory Farquhar, Tim Rocktäschel, Maximilian Igl, Shimon Whiteson

To address these challenges, we propose TreeQN, a differentiable, recursive, tree-structured model that serves as a drop-in replacement for any value function network in deep RL with discrete actions.

Atari Games · reinforcement-learning +2
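A rough sketch of a TreeQN-style differentiable tree backup (module names, sizes, and the tanh transition are illustrative rather than the paper's architecture details): per-action transitions are applied in a learned latent space, rewards and leaf values are predicted, and values are backed up with a max, all inside one differentiable forward pass.

    import torch
    import torch.nn as nn

    class TreeBackup(nn.Module):
        def __init__(self, latent_dim, n_actions, gamma=0.99):
            super().__init__()
            self.gamma, self.n_actions = gamma, n_actions
            self.transition = nn.ModuleList(nn.Linear(latent_dim, latent_dim)
                                            for _ in range(n_actions))
            self.reward = nn.Linear(latent_dim, n_actions)
            self.value = nn.Linear(latent_dim, 1)

        def q_values(self, z, depth):
            # Q(z, a) = r(z, a) + gamma * backup of the latent state reached by action a.
            rewards = self.reward(z)                               # [batch, n_actions]
            next_vals = []
            for a in range(self.n_actions):
                z_next = torch.tanh(self.transition[a](z))
                if depth == 1:
                    next_vals.append(self.value(z_next))           # leaf value
                else:
                    q_next = self.q_values(z_next, depth - 1)
                    next_vals.append(q_next.max(-1, keepdim=True).values)
            return rewards + self.gamma * torch.cat(next_vals, dim=-1)

    tree = TreeBackup(latent_dim=16, n_actions=4)
    print(tree.q_values(torch.randn(2, 16), depth=2).shape)        # torch.Size([2, 4])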

Counterfactual Multi-Agent Policy Gradients

6 code implementations · 24 May 2017 · Jakob Foerster, Gregory Farquhar, Triantafyllos Afouras, Nantas Nardelli, Shimon Whiteson

COMA uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents' policies.

Autonomous Vehicles · counterfactual +2
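A sketch of the counterfactual advantage that a centralised critic makes possible (tensor shapes are illustrative): the baseline for one agent marginalises the critic over that agent's own actions under its policy while the other agents' actions are held fixed.

    import torch

    def counterfactual_advantage(q_agent, pi_agent, taken_action):
        # q_agent:      [batch, n_actions]  critic values for each alternative action of
        #                                   agent a, other agents' actions held fixed
        # pi_agent:     [batch, n_actions]  agent a's policy over its own actions
        # taken_action: [batch]             the action agent a actually took
        q_taken = q_agent.gather(1, taken_action.unsqueeze(-1)).squeeze(-1)
        baseline = (pi_agent * q_agent).sum(dim=-1)      # counterfactual baseline
        return q_taken - baseline

    q = torch.randn(5, 3)
    pi = torch.softmax(torch.randn(5, 3), dim=-1)
    actions = torch.randint(0, 3, (5,))
    print(counterfactual_advantage(q, pi, actions).shape)   # torch.Size([5])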
