Search Results for author: Brendan D. Tracey

Found 8 papers, 4 papers with code

Towards practical reinforcement learning for tokamak magnetic control

no code implementations • 21 Jul 2023 • Brendan D. Tracey, Andrea Michi, Yuri Chervonyi, Ian Davies, Cosmin Paduraru, Nevena Lazic, Federico Felici, Timo Ewalds, Craig Donner, Cristian Galperti, Jonas Buchli, Michael Neunert, Andrea Huber, Jonathan Evens, Paula Kurylowicz, Daniel J. Mankowitz, Martin Riedmiller, The TCV Team

Reinforcement learning (RL) has shown promising results for real-time control systems, including the domain of plasma magnetic control.

Reinforcement Learning (RL)

From Motor Control to Team Play in Simulated Humanoid Football

1 code implementation • 25 May 2021 • SiQi Liu, Guy Lever, Zhe Wang, Josh Merel, S. M. Ali Eslami, Daniel Hennes, Wojciech M. Czarnecki, Yuval Tassa, Shayegan Omidshafiei, Abbas Abdolmaleki, Noah Y. Siegel, Leonard Hasenclever, Luke Marris, Saran Tunyasuvunakool, H. Francis Song, Markus Wulfmeier, Paul Muller, Tuomas Haarnoja, Brendan D. Tracey, Karl Tuyls, Thore Graepel, Nicolas Heess

In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds.

Imitation Learning, Multi-agent Reinforcement Learning, +1

Caveats for information bottleneck in deterministic scenarios

1 code implementation • ICLR 2019 • Artemy Kolchinsky, Brendan D. Tracey, Steven Van Kuyk

We demonstrate three caveats when using IB in any situation where $Y$ is a deterministic function of $X$: (1) the IB curve cannot be recovered by maximizing the IB Lagrangian for different values of $\beta$; (2) there are "uninteresting" trivial solutions at all points of the IB curve; and (3) for multi-layer classifiers that achieve low prediction error, different layers cannot exhibit a strict trade-off between compression and prediction, contrary to a recent proposal.
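
For reference, the IB Lagrangian mentioned in caveat (1) is commonly written as below. The symbol $T$ for the compressed (bottleneck) representation of $X$ and the sign convention are assumptions here, since this listing does not define them; some references instead minimize $I(X;T) - \beta I(Y;T)$, which rescales $\beta$.

```latex
% IB Lagrangian, maximized over stochastic encodings p(t|x)
% subject to the Markov chain Y - X - T
\mathcal{L}_{\mathrm{IB}}(T) = I(Y;T) - \beta \, I(X;T)

% IB curve: best achievable prediction at compression level r
F(r) = \max_{T \,:\, I(X;T) \le r} I(Y;T)
```

Under this convention, sweeping $\beta$ is the usual way of tracing out points on $F(r)$, which is the procedure that caveat (1) says fails when $Y$ is a deterministic function of $X$.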

Upgrading from Gaussian Processes to Student's-T Processes

no code implementations • 18 Jan 2018 • Brendan D. Tracey, David H. Wolpert

The Student's-T distribution has higher kurtosis than a Gaussian distribution, so outliers are much more likely, and the posterior variance increases or decreases depending on the variance of the observed data values.

Bayesian Optimization, Gaussian Processes

Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes

1 code implementation • 19 Sep 2017 • Kunal Menda, Yi-Chun Chen, Justin Grana, James W. Bono, Brendan D. Tracey, Mykel J. Kochenderfer, David Wolpert

The incorporation of macro-actions (temporally extended actions) into multi-agent decision problems has the potential to address the curse of dimensionality associated with such decision problems.

Reinforcement Learning (RL)

Estimating Mixture Entropy with Pairwise Distances

no code implementations • 8 Jun 2017 • Artemy Kolchinsky, Brendan D. Tracey

We prove that this family of pairwise-distance estimators includes lower and upper bounds on the mixture entropy.

Nonlinear Information Bottleneck

3 code implementations • 6 May 2017 • Artemy Kolchinsky, Brendan D. Tracey, David H. Wolpert

Information bottleneck (IB) is a technique for extracting information in one random variable $X$ that is relevant for predicting another random variable $Y$.

Reducing the error of Monte Carlo Algorithms by Learning Control Variates

no code implementations • 7 Jun 2016 • Brendan D. Tracey, David H. Wolpert

Crucially, the method is a post-processing technique: it requires no additional samples and can be applied to data generated by any MC estimator.
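
The post-processing idea can be illustrated with a classic fixed control variate. This is a minimal sketch, not the learned control variates of the paper: the integrand `exp(x)` on the unit interval, the control variate `g(x) = x` with known mean 0.5, and the helper name `control_variate_estimate` are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def control_variate_estimate(f_vals, g_vals, g_mean):
    """Adjust a plain MC mean using a control variate with known mean.

    Works purely on samples that were already drawn (post-processing).
    Returns the adjusted estimate and the fitted coefficient beta.
    """
    n = len(f_vals)
    f_bar = statistics.fmean(f_vals)
    g_bar = statistics.fmean(g_vals)
    # Sample covariance of f and g, and sample variance of g.
    cov_fg = sum((f - f_bar) * (g - g_bar)
                 for f, g in zip(f_vals, g_vals)) / (n - 1)
    var_g = statistics.variance(g_vals)
    beta = cov_fg / var_g
    # Subtract the part of the sampling error that g explains.
    return f_bar - beta * (g_bar - g_mean), beta


# Hypothetical setup: estimate the integral of exp(x) over [0, 1].
rng = random.Random(0)
xs = [rng.random() for _ in range(2000)]
f_vals = [math.exp(x) for x in xs]   # integrand samples
g_vals = xs                          # control variate, E[g] = 0.5

plain = statistics.fmean(f_vals)
adjusted, beta = control_variate_estimate(f_vals, g_vals, 0.5)
exact = math.e - 1.0

# Per-sample values after adjustment, to compare spread with the raw samples.
adjusted_samples = [f - beta * (g - 0.5) for f, g in zip(f_vals, g_vals)]

print(f"plain MC estimate:    {plain:.5f}")
print(f"with control variate: {adjusted:.5f}")
print(f"exact value:          {exact:.5f}")
```

Because `beta` is fitted from the same samples, the adjustment needs no new draws. The variance reduction depends on how strongly `g` correlates with the integrand, which is the motivation for learning a better `g` from data.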
