no code implementations • 5 Jun 2023 • Tomáš Kocák, Alexandra Carpentier
Sequential learning with feedback graphs is a natural extension of the multi-armed bandit problem in which an underlying graph structure provides additional information: playing an action reveals the losses of all of its neighbors.
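To make the feedback mechanism concrete, here is a minimal sketch of what a learner observes in one round. The adjacency-dictionary representation, the function name `observed_losses`, and the example numbers are all hypothetical, and the sketch assumes the played action's own loss is revealed as well; none of this is taken from the paper.

```python
from typing import Dict, List


def observed_losses(
    graph: Dict[int, List[int]], losses: List[float], action: int
) -> Dict[int, float]:
    """Return the losses revealed by playing `action` in one round.

    Assumption: the learner sees the loss of the played action itself
    in addition to the losses of its neighbors in the feedback graph.
    """
    revealed = set(graph.get(action, [])) | {action}
    return {a: losses[a] for a in sorted(revealed)}


# Hypothetical 4-action instance: action 0 is connected to actions 1 and 2,
# action 3 is isolated (it behaves like a standard bandit arm).
graph = {0: [1, 2], 1: [0], 2: [0], 3: []}
losses = [0.3, 0.9, 0.1, 0.5]

feedback = observed_losses(graph, losses, 0)
```

With an empty graph every action reveals only its own loss, which recovers the standard bandit feedback; with a complete graph every round reveals all losses, which corresponds to full information.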
no code implementations • 13 Feb 2022 • Aymen Al Marjani, Tomáš Kocák, Aurélien Garivier
Our method is based on a complete characterization of the alternative bandit instances that the optimal sampling strategy needs to rule out, thus making our bound tighter than the one provided by Mason et al. (2020).
no code implementations • 27 May 2021 • Antoine Barrier, Aurélien Garivier, Tomáš Kocák
We propose a new strategy for best-arm identification with fixed confidence of Gaussian variables with bounded means and unit variance.
no code implementations • 20 May 2020 • Tomáš Kocák, Aurélien Garivier
We study best-arm identification with fixed confidence in bandit models with graph smoothness constraint.
no code implementations • NeurIPS 2014 • Tomáš Kocák, Gergely Neu, Michal Valko, Remi Munos
As the predictions of our first algorithm cannot always be computed efficiently in this setting, we propose another algorithm with similar properties that is always computationally efficient, at the price of a slightly more complicated tuning mechanism.