Search Results for author: Ryan D'Orazio

Found 11 papers, 6 papers with code

Abstracting Imperfect Information Away from Two-Player Zero-Sum Games

no code implementations • 22 Jan 2023 • Samuel Sokota, Ryan D'Orazio, Chun Kai Ling, David J. Wu, J. Zico Kolter, Noam Brown

Because these regularized equilibria can be made arbitrarily close to Nash equilibria, our result opens the door to a new perspective to solving two-player zero-sum games and yields a simplified framework for decision-time planning in two-player zero-sum games, void of the unappealing properties that plague existing decision-time planning approaches.

Vocal Bursts Valence Prediction

A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games

3 code implementations • 12 Jun 2022 • Samuel Sokota, Ryan D'Orazio, J. Zico Kolter, Nicolas Loizou, Marc Lanctot, Ioannis Mitliagkas, Noam Brown, Christian Kroer

This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm.

MuJoCo Games reinforcement-learning +1

Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games: Corrections

1 code implementation • 24 May 2022 • Dustin Morrill, Ryan D'Orazio, Marc Lanctot, James R. Wright, Michael Bowling, Amy R. Greenwald

Hindsight rationality is an approach to playing general-sum games that prescribes no-regret learning dynamics for individual agents with respect to a set of deviations, and further describes jointly rational behavior among multiple agents with mediated equilibria.

counterfactual Decision Making

Stochastic Mirror Descent: Convergence Analysis and Adaptive Variants via the Mirror Stochastic Polyak Stepsize

no code implementations • 28 Oct 2021 • Ryan D'Orazio, Nicolas Loizou, Issam Laradji, Ioannis Mitliagkas

We investigate the convergence of stochastic mirror descent (SMD) under interpolation in relatively smooth and smooth convex optimization.

Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games

1 code implementation • 13 Feb 2021 • Dustin Morrill, Ryan D'Orazio, Marc Lanctot, James R. Wright, Michael Bowling, Amy Greenwald

Hindsight rationality is an approach to playing general-sum games that prescribes no-regret learning dynamics for individual agents with respect to a set of deviations, and further describes jointly rational behavior among multiple agents with mediated equilibria.

counterfactual Decision Making

Optimistic and Adaptive Lagrangian Hedging

no code implementations • 23 Jan 2021 • Ryan D'Orazio, Ruitong Huang

The generality of this framework includes problems that are not adversarial, for example offline optimization or saddle-point problems (i.e., min-max optimization).

Solving Common-Payoff Games with Approximate Policy Iteration

2 code implementations • 11 Jan 2021 • Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

While this choice precludes CAPI from scaling to games as large as Hanabi, empirical results demonstrate that, on the games to which CAPI does scale, it is capable of discovering optimal joint policies even when other modern multi-agent reinforcement learning algorithms are unable to do so.

Multi-agent Reinforcement Learning reinforcement-learning +1

Hindsight and Sequential Rationality of Correlated Play

1 code implementation • 10 Dec 2020 • Dustin Morrill, Ryan D'Orazio, Reca Sarfati, Marc Lanctot, James R. Wright, Amy Greenwald, Michael Bowling

This approach also leads to a game-theoretic analysis, but in the correlated play that arises from joint learning dynamics rather than factored agent behavior at equilibrium.

counterfactual Decision Making +1

Alternative Function Approximation Parameterizations for Solving Games: An Analysis of $f$-Regression Counterfactual Regret Minimization

no code implementations • 6 Dec 2019 • Ryan D'Orazio, Dustin Morrill, James R. Wright, Michael Bowling

In contrast, the more conventional softmax parameterization is standard in the field of reinforcement learning and yields a regret bound with a better dependence on the number of actions.

counterfactual regression +2

Bounds for Approximate Regret-Matching Algorithms

no code implementations • 3 Oct 2019 • Ryan D'Orazio, Dustin Morrill, James R. Wright

A common approach to incorporating function approximation is to learn the inputs needed for a regret minimizing algorithm, allowing for generalization across many regret minimization problems.

regression

Simultaneous Prediction Intervals for Patient-Specific Survival Curves

1 code implementation • 25 Jun 2019 • Samuel Sokota, Ryan D'Orazio, Khurram Javed, Humza Haider, Russell Greiner

In this paper, we demonstrate that an existing method for estimating simultaneous prediction intervals from samples can easily be adapted for patient-specific survival curve analysis and yields accurate results.

Descriptive Prediction Intervals +2
