Search Results for author: Martin Schmid

Found 14 papers, 3 papers with code

Learning to Beat ByteRL: Exploitability of Collectible Card Game Agents

no code implementations • 25 Apr 2024 • Radovan Haluska, Martin Schmid

While Poker, as a family of games, has been studied extensively in the last decades, collectible card games have seen relatively little attention.

Learning not to Regret

no code implementations • 2 Mar 2023 • David Sychrovský, Michal Šustr, Elnaz Davoodi, Michael Bowling, Marc Lanctot, Martin Schmid

As these similar games feature similar equilibria, we investigate a way to accelerate equilibrium finding on such a distribution.

Student of Games: A unified learning algorithm for both perfect and imperfect information games

no code implementations • 6 Dec 2021 • Martin Schmid, Matej Moravcik, Neil Burch, Rudolf Kadlec, Josh Davidson, Kevin Waugh, Nolan Bard, Finbarr Timbers, Marc Lanctot, G. Zacharias Holland, Elnaz Davoodi, Alden Christianson, Michael Bowling

Games have a long history as benchmarks for progress in artificial intelligence.

Search in Imperfect Information Games

no code implementations • 10 Nov 2021 • Martin Schmid

From the very dawn of the field, search with value functions was a fundamental concept of computer games research.

Solving Common-Payoff Games with Approximate Policy Iteration

2 code implementations • 11 Jan 2021 • Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

While this choice precludes CAPI from scaling to games as large as Hanabi, empirical results demonstrate that, on the games to which CAPI does scale, it is capable of discovering optimal joint policies even when other modern multi-agent reinforcement learning algorithms are unable to do so.

Multi-agent Reinforcement Learning • reinforcement-learning

The Advantage Regret-Matching Actor-Critic

no code implementations • 27 Aug 2020 • Audrūnas Gruslys, Marc Lanctot, Rémi Munos, Finbarr Timbers, Martin Schmid, Julien Perolat, Dustin Morrill, Vinicius Zambaldi, Jean-Baptiste Lespiau, John Schultz, Mohammad Gheshlaghi Azar, Michael Bowling, Karl Tuyls

In this paper, we describe a general model-free RL method for no-regret learning based on repeated reconsideration of past behavior.

counterfactual • Reinforcement Learning (RL)

Approximate exploitability: Learning a best response in large games

no code implementations • 20 Apr 2020 • Finbarr Timbers, Nolan Bard, Edward Lockhart, Marc Lanctot, Martin Schmid, Neil Burch, Julian Schrittwieser, Thomas Hubert, Michael Bowling

In prior games research, agent evaluation often focused on the in-practice game outcomes.

Low-Variance and Zero-Variance Baselines for Extensive-Form Games

no code implementations • ICML 2020 • Trevor Davis, Martin Schmid, Michael Bowling

In this paper, we extend recent work that uses baseline estimates to reduce this variance.

counterfactual

Rethinking Formal Models of Partially Observable Multiagent Decision Making

no code implementations • 26 Jun 2019 • Vojtěch Kovařík, Martin Schmid, Neil Burch, Michael Bowling, Viliam Lisý

A second issue is that while EFGs have recently seen significant algorithmic progress, their classical formalization is unsuitable for efficient presentation of the underlying ideas, such as those around decomposition.

counterfactual • Decision Making

Variance Reduction in Monte Carlo Counterfactual Regret Minimization (VR-MCCFR) for Extensive Form Games using Baselines

no code implementations • 9 Sep 2018 • Martin Schmid, Neil Burch, Marc Lanctot, Matej Moravcik, Rudolf Kadlec, Michael Bowling

The new formulation allows estimates to be bootstrapped from other estimates within the same episode, propagating the benefits of baselines along the sampled trajectory; the estimates remain unbiased even when bootstrapping from other estimates.

counterfactual

DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker

1 code implementation • 6 Jan 2017 • Matej Moravčík, Martin Schmid, Neil Burch, Viliam Lisý, Dustin Morrill, Nolan Bard, Trevor Davis, Kevin Waugh, Michael Johanson, Michael Bowling

Poker is the quintessential game of imperfect information, and a longstanding challenge problem in artificial intelligence.

Game of Poker

AIVAT: A New Variance Reduction Technique for Agent Evaluation in Imperfect Information Games

no code implementations • 20 Dec 2016 • Neil Burch, Martin Schmid, Matej Moravčík, Michael Bowling

Evaluating agent performance when outcomes are stochastic and agents use randomized strategies can be challenging when there is limited data available.

Text Understanding with the Attention Sum Reader Network

2 code implementations • ACL 2016 • Rudolf Kadlec, Martin Schmid, Ondrej Bajgar, Jan Kleindienst

Several large cloze-style context-question-answer datasets have been introduced recently: the CNN and Daily Mail news data and the Children's Book Test.

Ranked #5 on Open-Domain Question Answering on SearchQA (Unigram Acc metric)

Machine Reading Comprehension • Open-Domain Question Answering

Improved Deep Learning Baselines for Ubuntu Corpus Dialogs

no code implementations • 13 Oct 2015 • Rudolf Kadlec, Martin Schmid, Jan Kleindienst

The ensemble further improves the performance and it achieves a state-of-the-art result for the next utterance ranking on this dataset.

Ranked #23 on Conversational Response Selection on Ubuntu Dialogue (v1, Ranking)

Conversational Response Selection
