no code implementations • 25 Apr 2024 • Radovan Haluska, Martin Schmid
While Poker, as a family of games, has been studied extensively in the last decades, collectible card games have seen relatively little attention.
no code implementations • 2 Mar 2023 • David Sychrovský, Michal Šustr, Elnaz Davoodi, Michael Bowling, Marc Lanctot, Martin Schmid
As these similar games feature similar equilibria, we investigate a way to accelerate equilibrium finding on such a distribution.
no code implementations • 6 Dec 2021 • Martin Schmid, Matej Moravcik, Neil Burch, Rudolf Kadlec, Josh Davidson, Kevin Waugh, Nolan Bard, Finbarr Timbers, Marc Lanctot, G. Zacharias Holland, Elnaz Davoodi, Alden Christianson, Michael Bowling
Games have a long history as benchmarks for progress in artificial intelligence.
no code implementations • 10 Nov 2021 • Martin Schmid
From the very dawn of the field, search with value functions was a fundamental concept of computer games research.
2 code implementations • 11 Jan 2021 • Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot
While this choice precludes CAPI from scaling to games as large as Hanabi, empirical results demonstrate that, on the games to which CAPI does scale, it is capable of discovering optimal joint policies even when other modern multi-agent reinforcement learning algorithms are unable to do so.
Multi-agent Reinforcement Learning • reinforcement-learning
no code implementations • 27 Aug 2020 • Audrūnas Gruslys, Marc Lanctot, Rémi Munos, Finbarr Timbers, Martin Schmid, Julien Perolat, Dustin Morrill, Vinicius Zambaldi, Jean-Baptiste Lespiau, John Schultz, Mohammad Gheshlaghi Azar, Michael Bowling, Karl Tuyls
In this paper, we describe a general model-free RL method for no-regret learning based on repeated reconsideration of past behavior.
no code implementations • 20 Apr 2020 • Finbarr Timbers, Nolan Bard, Edward Lockhart, Marc Lanctot, Martin Schmid, Neil Burch, Julian Schrittwieser, Thomas Hubert, Michael Bowling
In prior games research, agent evaluation often focused on the in-practice game outcomes.
no code implementations • ICML 2020 • Trevor Davis, Martin Schmid, Michael Bowling
In this paper, we extend recent work that uses baseline estimates to reduce this variance.
no code implementations • 26 Jun 2019 • Vojtěch Kovařík, Martin Schmid, Neil Burch, Michael Bowling, Viliam Lisý
A second issue is that while EFGs have recently seen significant algorithmic progress, their classical formalization is unsuitable for efficient presentation of the underlying ideas, such as those around decomposition.
no code implementations • 9 Sep 2018 • Martin Schmid, Neil Burch, Marc Lanctot, Matej Moravcik, Rudolf Kadlec, Michael Bowling
The new formulation allows estimates to be bootstrapped from other estimates within the same episode, propagating the benefits of baselines along the sampled trajectory; the estimates remain unbiased even when bootstrapping from other estimates.
1 code implementation • 6 Jan 2017 • Matej Moravčík, Martin Schmid, Neil Burch, Viliam Lisý, Dustin Morrill, Nolan Bard, Trevor Davis, Kevin Waugh, Michael Johanson, Michael Bowling
Poker is the quintessential game of imperfect information, and a longstanding challenge problem in artificial intelligence.
no code implementations • 20 Dec 2016 • Neil Burch, Martin Schmid, Matej Moravčík, Michael Bowling
Evaluating agent performance when outcomes are stochastic and agents use randomized strategies can be challenging when there is limited data available.
2 code implementations • ACL 2016 • Rudolf Kadlec, Martin Schmid, Ondrej Bajgar, Jan Kleindienst
Several large cloze-style context-question-answer datasets have been introduced recently: the CNN and Daily Mail news data and the Children's Book Test.
Ranked #5 on Open-Domain Question Answering on SearchQA (Unigram Acc metric)
Machine Reading Comprehension • Open-Domain Question Answering
no code implementations • 13 Oct 2015 • Rudolf Kadlec, Martin Schmid, Jan Kleindienst
The ensemble further improves the performance and it achieves a state-of-the-art result for the next utterance ranking on this dataset.