Search Results for author: Dustin Morrill

Found 15 papers, 6 papers with code

Composing Efficient, Robust Tests for Policy Selection

no code implementations • 12 Jun 2023 • Dustin Morrill, Thomas J. Walsh, Daniel Hernandez, Peter R. Wurman, Peter Stone

Empirical results demonstrate that RPOSST finds a small set of test cases that identify high quality policies in a toy one-shot game, poker datasets, and a high-fidelity racing simulator.

Interpolating Between Softmax Policy Gradient and Neural Replicator Dynamics with Capped Implicit Exploration

no code implementations • 4 Jun 2022 • Dustin Morrill, Esra'a Saleh, Michael Bowling, Amy Greenwald

Neural replicator dynamics (NeuRD) is an alternative to the foundational softmax policy gradient (SPG) algorithm motivated by online learning and evolutionary game theory.

Decision Making

Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games: Corrections

1 code implementation • 24 May 2022 • Dustin Morrill, Ryan D'Orazio, Marc Lanctot, James R. Wright, Michael Bowling, Amy R. Greenwald

Hindsight rationality is an approach to playing general-sum games that prescribes no-regret learning dynamics for individual agents with respect to a set of deviations, and further describes jointly rational behavior among multiple agents with mediated equilibria.

counterfactual, Decision Making

The Partially Observable History Process

no code implementations • 15 Nov 2021 • Dustin Morrill, Amy R. Greenwald, Michael Bowling

We introduce the partially observable history process (POHP) formalism for reinforcement learning.

reinforcement-learning, Reinforcement Learning (RL)

Learning to Be Cautious

no code implementations • 29 Oct 2021 • Montaser Mohammedalamen, Dustin Morrill, Alexander Sieusahai, Yash Satsangi, Michael Bowling

An agent that could learn to be cautious would overcome this challenge by discovering for itself when and how to behave cautiously.

counterfactual

Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games

1 code implementation • 13 Feb 2021 • Dustin Morrill, Ryan D'Orazio, Marc Lanctot, James R. Wright, Michael Bowling, Amy Greenwald

Hindsight rationality is an approach to playing general-sum games that prescribes no-regret learning dynamics for individual agents with respect to a set of deviations, and further describes jointly rational behavior among multiple agents with mediated equilibria.

counterfactual, Decision Making

Hindsight and Sequential Rationality of Correlated Play

1 code implementation • 10 Dec 2020 • Dustin Morrill, Ryan D'Orazio, Reca Sarfati, Marc Lanctot, James R. Wright, Amy Greenwald, Michael Bowling

This approach also leads to a game-theoretic analysis, but in the correlated play that arises from joint learning dynamics rather than factored agent behavior at equilibrium.

counterfactual, Decision Making, +1

Alternative Function Approximation Parameterizations for Solving Games: An Analysis of $f$-Regression Counterfactual Regret Minimization

no code implementations • 6 Dec 2019 • Ryan D'Orazio, Dustin Morrill, James R. Wright, Michael Bowling

In contrast, the more conventional softmax parameterization is standard in the field of reinforcement learning and yields a regret bound with a better dependence on the number of actions.

counterfactual, regression, +2

Bounds for Approximate Regret-Matching Algorithms

no code implementations • 3 Oct 2019 • Ryan D'Orazio, Dustin Morrill, James R. Wright

A common approach to incorporating function approximation is to learn the inputs needed for a regret minimizing algorithm, allowing for generalization across many regret minimization problems.

regression

Computing Approximate Equilibria in Sequential Adversarial Games by Exploitability Descent

no code implementations • 13 Mar 2019 • Edward Lockhart, Marc Lanctot, Julien Pérolat, Jean-Baptiste Lespiau, Dustin Morrill, Finbarr Timbers, Karl Tuyls

In this paper, we present exploitability descent, a new algorithm to compute approximate equilibria in two-player zero-sum extensive-form games with imperfect information, by direct policy optimization against worst-case opponents.

counterfactual

DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker

1 code implementation • 6 Jan 2017 • Matej Moravčík, Martin Schmid, Neil Burch, Viliam Lisý, Dustin Morrill, Nolan Bard, Trevor Davis, Kevin Waugh, Michael Johanson, Michael Bowling

Poker is the quintessential game of imperfect information, and a longstanding challenge problem in artificial intelligence.

Game of Poker

Solving Games with Functional Regret Estimation

no code implementations • 28 Nov 2014 • Kevin Waugh, Dustin Morrill, J. Andrew Bagnell, Michael Bowling

We propose a novel online learning method for minimizing regret in large extensive-form games.
