Search Results for author: Ryan D'Orazio

Found 11 papers, 6 papers with code

Abstracting Imperfect Information Away from Two-Player Zero-Sum Games

no code implementations • 22 Jan 2023 • Samuel Sokota, Ryan D'Orazio, Chun Kai Ling, David J. Wu, J. Zico Kolter, Noam Brown

Because these regularized equilibria can be made arbitrarily close to Nash equilibria, our result opens the door to a new perspective to solving two-player zero-sum games and yields a simplified framework for decision-time planning in two-player zero-sum games, void of the unappealing properties that plague existing decision-time planning approaches.

Vocal Bursts Valence Prediction

A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

3 code implementations • 12 Jun 2022 • Samuel Sokota, Ryan D'Orazio, J. Zico Kolter, Nicolas Loizou, Marc Lanctot, Ioannis Mitliagkas, Noam Brown, Christian Kroer

This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm.

MuJoCo Games reinforcement-learning +1

Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games: Corrections

1 code implementation • 24 May 2022 • Dustin Morrill, Ryan D'Orazio, Marc Lanctot, James R. Wright, Michael Bowling, Amy R. Greenwald

Hindsight rationality is an approach to playing general-sum games that prescribes no-regret learning dynamics for individual agents with respect to a set of deviations, and further describes jointly rational behavior among multiple agents with mediated equilibria.

counterfactual Decision Making

Stochastic Mirror Descent: Convergence Analysis and Adaptive Variants via the Mirror Stochastic Polyak Stepsize

no code implementations • 28 Oct 2021 • Ryan D'Orazio, Nicolas Loizou, Issam Laradji, Ioannis Mitliagkas

We investigate the convergence of stochastic mirror descent (SMD) under interpolation in relatively smooth and smooth convex optimization.

Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games

1 code implementation • 13 Feb 2021 • Dustin Morrill, Ryan D'Orazio, Marc Lanctot, James R. Wright, Michael Bowling, Amy Greenwald

counterfactual Decision Making

Optimistic and Adaptive Lagrangian Hedging

no code implementations • 23 Jan 2021 • Ryan D'Orazio, Ruitong Huang

The generality of this framework includes problems that are not adversarial, for example offline optimization, or saddle point problems (i.e., min-max optimization).

Solving Common-Payoff Games with Approximate Policy Iteration

2 code implementations • 11 Jan 2021 • Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

While this choice precludes CAPI from scaling to games as large as Hanabi, empirical results demonstrate that, on the games to which CAPI does scale, it is capable of discovering optimal joint policies even when other modern multi-agent reinforcement learning algorithms are unable to do so.

Multi-agent Reinforcement Learning reinforcement-learning +1

Hindsight and Sequential Rationality of Correlated Play

1 code implementation • 10 Dec 2020 • Dustin Morrill, Ryan D'Orazio, Reca Sarfati, Marc Lanctot, James R. Wright, Amy Greenwald, Michael Bowling

This approach also leads to a game-theoretic analysis, but in the correlated play that arises from joint learning dynamics rather than factored agent behavior at equilibrium.

counterfactual Decision Making +1

Alternative Function Approximation Parameterizations for Solving Games: An Analysis of $f$-Regression Counterfactual Regret Minimization

no code implementations • 6 Dec 2019 • Ryan D'Orazio, Dustin Morrill, James R. Wright, Michael Bowling

In contrast, the more conventional softmax parameterization is standard in the field of reinforcement learning and yields a regret bound with a better dependence on the number of actions.

counterfactual regression +2

Bounds for Approximate Regret-Matching Algorithms

no code implementations • 3 Oct 2019 • Ryan D'Orazio, Dustin Morrill, James R. Wright

A common approach to incorporating function approximation is to learn the inputs needed for a regret minimizing algorithm, allowing for generalization across many regret minimization problems.

regression

Simultaneous Prediction Intervals for Patient-Specific Survival Curves

1 code implementation • 25 Jun 2019 • Samuel Sokota, Ryan D'Orazio, Khurram Javed, Humza Haider, Russell Greiner

In this paper, we demonstrate that an existing method for estimating simultaneous prediction intervals from samples can easily be adapted for patient-specific survival curve analysis and yields accurate results.

Descriptive Prediction Intervals +2
