1 code implementation • 21 Mar 2022 • Dorian Baudry, Yoan Russac, Emilie Kaufmann
In this paper, we contribute to the Extreme Bandit problem, a variant of Multi-Armed Bandits in which the learner seeks to maximize the largest single reward collected over the horizon, rather than the cumulative sum of rewards.
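The extremal objective can lead to different arm preferences than the usual cumulative one. A toy simulation (the distributions below are illustrative assumptions, not taken from the paper) shows that a light-tailed arm can win on the mean while a heavy-tailed arm wins on the expected maximum:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_sims = 1000, 200

# Arm A: light-tailed exponential, mean 2.0.
# Arm B: heavy-tailed Pareto (tail index 1.5), mean 1.2 -- lower mean,
# but its rare large draws dominate the maximum over T pulls.
max_a = np.array([rng.exponential(2.0, T).max() for _ in range(n_sims)])
max_b = np.array([(0.4 * (rng.pareto(1.5, T) + 1)).max() for _ in range(n_sims)])

# A cumulative-regret learner prefers arm A (higher mean); an extreme-bandit
# learner prefers arm B (higher expected maximum after T pulls).
print("avg max, arm A:", max_a.mean())
print("avg max, arm B:", max_b.mean())
```

Here arm B's expected maximum grows roughly like T^(2/3) because of its Pareto tail, while arm A's grows only logarithmically in T.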
no code implementations • NeurIPS 2021 • Yoan Russac, Christina Katsimerou, Dennis Bohle, Olivier Cappé, Aurélien Garivier, Wouter Koolen
At every time step, a subpopulation is sampled and an arm is chosen: the resulting observation is an independent draw from the arm conditioned on the subpopulation.
1 code implementation • 21 Jun 2021 • Dorian Baudry, Yoan Russac, Olivier Cappé
There has been a recent surge of interest in nonparametric bandit algorithms based on subsampling.
no code implementations • 9 Mar 2021 • Louis Faury, Yoan Russac, Marc Abeille, Clément Calauzènes
Generalized Linear Bandits (GLBs) are powerful extensions to the Linear Bandit (LB) setting, broadening the benefits of reward parametrization beyond linearity.
no code implementations • 2 Nov 2020 • Yoan Russac, Louis Faury, Olivier Cappé, Aurélien Garivier
Contextual sequential decision problems with categorical or numerical observations are ubiquitous and Generalized Linear Bandits (GLB) offer a solid theoretical framework to address them.
no code implementations • 23 Mar 2020 • Yoan Russac, Olivier Cappé, Aurélien Garivier
The statistical framework of Generalized Linear Models (GLM) can be applied to sequential problems involving categorical or ordinal rewards associated, for instance, with clicks, likes or ratings.
1 code implementation • NeurIPS 2019 • Yoan Russac, Claire Vernade, Olivier Cappé
To address this problem, we propose D-LinUCB, a novel optimistic algorithm based on discounted linear regression, where exponential weights are used to smoothly forget the past.
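The discounting mechanism can be sketched as follows. This is a minimal illustration of the idea, linear UCB on top of exponentially weighted ridge regression, with a simplified exploration bonus standing in for the paper's exact confidence width; `gamma`, `lam`, and `beta` are illustrative values, not the tuned constants from the paper:

```python
import numpy as np

class DLinUCB:
    """Sketch of discounted LinUCB: past observations receive weight
    gamma**age, so the regression smoothly forgets old data."""

    def __init__(self, dim, gamma=0.99, lam=1.0, beta=1.0):
        self.gamma = gamma                 # discount factor (forgetting rate)
        self.lam = lam                     # ridge regularization
        self.beta = beta                   # exploration scale (assumed constant here)
        self.V = np.zeros((dim, dim))      # discounted Gram matrix (regularizer added at solve time)
        self.b = np.zeros(dim)             # discounted reward-weighted feature sum

    def select(self, arms):
        """arms: (K, dim) array of feature vectors; returns the optimistic arm index."""
        V_reg = self.V + self.lam * np.eye(self.b.size)
        V_inv = np.linalg.inv(V_reg)
        theta_hat = V_inv @ self.b
        widths = np.sqrt(np.einsum('ki,ij,kj->k', arms, V_inv, arms))
        return int(np.argmax(arms @ theta_hat + self.beta * widths))

    def update(self, x, reward):
        # Exponentially discount all accumulated statistics, then add the new point.
        self.V = self.gamma * self.V + np.outer(x, x)
        self.b = self.gamma * self.b + reward * x
```

Because `update` multiplies both statistics by `gamma` before adding the new observation, an observation that is `s` steps old contributes with weight `gamma**s`, which is exactly the smooth forgetting that lets the estimator track a slowly drifting parameter.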