no code implementations • 13 Dec 2023 • Reda Ouhamma, Maryam Kamgarpour
We consider decentralized learning for zero-sum games, where players see only their own payoff information and are agnostic to the actions and payoffs of the opponent.
no code implementations • 5 Oct 2022 • Reda Ouhamma, Debabrota Basu, Odalric-Ambrym Maillard
Our regret bound is order-optimal with respect to the horizon $H$ and the number of episodes $K$.
no code implementations • NeurIPS 2021 • Reda Ouhamma, Odalric Maillard, Vianney Perchet
We consider the problem of online linear regression in the stochastic setting.
no code implementations • NeurIPS 2021 • Reda Ouhamma, Rémy Degenne, Pierre Gaillard, Vianney Perchet
In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions.
no code implementations • ICLR 2021 • Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux
We prove the theoretical consistency of the new gradient estimator and observe dramatic empirical improvement across a variety of continuous control tasks and algorithms.