Search Results for author: Reda Ouhamma

Found 5 papers, 0 papers with code

Learning Nash Equilibria in Zero-Sum Markov Games: A Single Time-scale Algorithm Under Weak Reachability

no code implementations • 13 Dec 2023 • Reda Ouhamma, Maryam Kamgarpour

We consider decentralized learning for zero-sum games, where players only see their payoff information and are agnostic to actions and payoffs of the opponent.

Bilinear Exponential Family of MDPs: Frequentist Regret Bound with Tractable Exploration and Planning

no code implementations • 5 Oct 2022 • Reda Ouhamma, Debabrota Basu, Odalric-Ambrym Maillard

Our regret bound is order-optimal with respect to $H$ and $K$.

Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge

no code implementations • NeurIPS 2021 • Reda Ouhamma, Odalric Maillard, Vianney Perchet

We consider the problem of online linear regression in the stochastic setting.

regression

Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits

no code implementations • NeurIPS 2021 • Reda Ouhamma, Rémy Degenne, Pierre Gaillard, Vianney Perchet

In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions.

Learning Value Functions in Deep Policy Gradients using Residual Variance

no code implementations • ICLR 2021 • Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux

We prove the theoretical consistency of the new gradient estimator and observe dramatic empirical improvement across a variety of continuous control tasks and algorithms.

Continuous Control • Decision Making
