no code implementations • 27 Nov 2023 • Thomas Kleine Buening, Aadirupa Saha, Christos Dimitrakakis, Haifeng Xu
We study a strategic variant of the multi-armed bandit problem, which we term the strategic click-bandit.
1 code implementation • 21 Feb 2023 • Thomas Kleine Buening, Christos Dimitrakakis, Hannes Eriksson, Divya Grover, Emilio Jorge
While the Bayesian decision-theoretic framework offers an elegant solution to the problem of decision making under uncertainty, one question is how to appropriately select the prior distribution.
1 code implementation • 26 Oct 2022 • Thomas Kleine Buening, Victor Villin, Christos Dimitrakakis
Even with abundant data, current inverse reinforcement learning methods that focus on learning from a single environment can fail to handle slight changes in the environment dynamics.
no code implementations • 25 Oct 2022 • Thomas Kleine Buening, Aadirupa Saha
We study the problem of non-stationary dueling bandits and provide the first adaptive dynamic regret algorithm for this problem.
no code implementations • 8 Nov 2021 • Thomas Kleine Buening, Anne-Marie George, Christos Dimitrakakis
How should the first agent act in order to learn the joint reward function as quickly as possible and so that the joint policy is as close to optimal as possible?
no code implementations • 23 Feb 2021 • Thomas Kleine Buening, Meirav Segal, Debabrota Basu, Christos Dimitrakakis, Anne-Marie George
Typically, merit is defined with respect to some intrinsic measure of worth.