no code implementations • 24 Oct 2023 • REDA ALAMI, Mohammed Mahfoud, Mastane Achab
In a typical stochastic multi-armed bandit problem, the objective is often to maximize the expected sum of rewards over some time horizon $T$.
no code implementations • 4 Oct 2023 • Fouzi Boukhalfa, REDA ALAMI, Mastane Achab, Eric Moulines, Mehdi Bennis
In today's era, autonomous vehicles demand a safety level on par with aircraft.
no code implementations • 26 Sep 2023 • Mastane Achab
In particular, we show that such functions are in general not convex but still satisfy generalized convexity inequalities.
no code implementations • 31 May 2023 • Mohamed El Amine Seddik, Mastane Achab, Henrique Goulart, Merouane Debbah
In order to study the theoretical performance of this approach, we characterize the behavior of this best rank-one approximation in terms of the alignments of the obtained component vectors with the hidden model parameter vectors, in the large-dimensional regime.
no code implementations • 27 Apr 2023 • Mastane Achab, REDA ALAMI, Yasser Abdelaziz Dahou Djilali, Kirill Fedyanin, Eric Moulines
Reinforcement learning (RL) allows an agent interacting sequentially with an environment to maximize its long-term expected return.
no code implementations • 28 Dec 2021 • Mastane Achab, Gergely Neu
In dynamic programming (DP) and reinforcement learning (RL), an agent learns to act optimally in terms of expected long-term return by sequentially interacting with its environment modeled by a Markov decision process (MDP).
no code implementations • 12 Feb 2020 • Robin Vogel, Mastane Achab, Stéphan Clémençon, Charles Tillier
We consider statistical learning problems in which the distribution $P'$ of the training observations $Z'_1, \ldots, Z'_n$ differs from the distribution $P$ involved in the risk one seeks to minimize (referred to as the test distribution) but is still defined on the same measurable space as $P$ and dominates it.
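The domination assumption is what makes reweighting possible: when $P'$ dominates $P$, the Radon-Nikodym derivative $dP/dP'$ exists, so an expectation under the test distribution can be rewritten as a weighted expectation under the training distribution. The identity below is a sketch of that standard change of measure; the loss $\ell$, the parameter $\theta$ and the weight symbol $\Phi$ are notation assumed here for illustration and are not taken from the abstract.

```latex
\[
\mathbb{E}_{Z \sim P}\big[\ell(\theta, Z)\big]
  = \mathbb{E}_{Z' \sim P'}\big[\Phi(Z')\,\ell(\theta, Z')\big],
\qquad
\Phi = \frac{dP}{dP'} .
\]
```

The identity holds whenever $\ell(\theta, \cdot)$ is nonnegative or $P$-integrable.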
no code implementations • 25 Sep 2019 • Robin Vogel, Mastane Achab, Charles Tillier, Stéphan Clémençon
We consider statistical learning problems, when the distribution $P'$ of the training observations $Z'_1, \ldots, Z'_n$ differs from the distribution $P$ involved in the risk one seeks to minimize (referred to as the test distribution) but is still defined on the same measurable space as $P$ and dominates it.
1 code implementation • 15 Oct 2018 • Mastane Achab, Anna Korba, Stephan Clémençon
Whereas most dimensionality reduction techniques (e.g., PCA, ICA, NMF) for multivariate data essentially rely on linear algebra to a certain extent, summarizing ranking data, viewed as realizations of a random permutation $\Sigma$ on a set of items indexed by $i \in \{1, \ldots, n\}$, is a great statistical challenge, due to the absence of vector space structure for the set of permutations $\mathfrak{S}_n$.
no code implementations • 8 May 2018 • Mastane Achab, Stephan Clémençon, Aurélien Garivier
We adapt and study three well-known strategies for this purpose, which have been proved to be most efficient in other settings: kl-UCB, Bayes-UCB and Thompson Sampling.
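For readers unfamiliar with the last of these strategies, the following is a minimal sketch of textbook Thompson Sampling on a Bernoulli bandit with uniform Beta(1, 1) priors. It is not the adaptation studied in the paper; the function name `thompson_sampling` and its parameters `true_means`, `horizon` and `seed` are hypothetical and chosen only for illustration.

```python
import random


def thompson_sampling(true_means, horizon, seed=0):
    """Standard Thompson Sampling for Bernoulli arms (illustrative sketch).

    true_means: success probability of each arm (unknown to the learner,
                used here only to simulate rewards).
    Returns the total reward collected and the number of pulls per arm.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k  # observed rewards equal to 1, per arm
    failures = [0] * k   # observed rewards equal to 0, per arm
    total_reward = 0

    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (Beta(1, 1) prior).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(k)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(k), key=lambda a: samples[a])
        # Simulate a Bernoulli reward from the chosen arm.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward

    pulls = [successes[a] + failures[a] for a in range(k)]
    return total_reward, pulls
```

Calling `thompson_sampling([0.1, 0.9], horizon=2000)` returns the cumulative reward and the per-arm pull counts; with a gap this large, the pulls concentrate on the second arm.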
no code implementations • NeurIPS 2017 • Stephan Clémençon, Mastane Achab
This problem generalizes bi/multi-partite ranking to a certain extent and the task of finding optimal scoring functions $s(x)$ can be naturally cast as optimization of a dedicated functional criterion, called the IROC curve here, or as maximization of the Kendall $\tau$ related to the pair $(s(X), Y)$.
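As a rough illustration of the second criterion, the sketch below computes the empirical Kendall $\tau$ (the tau-a variant, where tied pairs count as neither concordant nor discordant) between scores $s(X_i)$ and outputs $Y_i$. The function name `kendall_tau` and the choice of the tau-a variant are assumptions made here; the paper may use a different convention for ties.

```python
def kendall_tau(scores, labels):
    """Empirical Kendall tau-a between two equal-length sequences.

    A pair (i, j) is concordant when scores and labels order it the same
    way, discordant when they order it in opposite ways; tied pairs
    contribute zero.
    """
    n = len(scores)
    if n != len(labels):
        raise ValueError("scores and labels must have the same length")
    if n < 2:
        raise ValueError("at least two observations are required")

    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (scores[i] - scores[j]) * (labels[i] - labels[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1

    # Normalize by the total number of pairs, n * (n - 1) / 2.
    return (concordant - discordant) / (n * (n - 1) / 2)
```

A scoring function that orders the observations exactly as $Y$ does attains the maximal value 1, which is why maximizing this quantity over $s$ is a natural ranking criterion. The quadratic loop is kept for readability.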
no code implementations • 27 Jul 2017 • Mastane Achab, Stephan Clémençon, Aurélien Garivier, Anne Sabourin, Claire Vernade
This paper is devoted to the study of the max K-armed bandit problem, which consists in sequentially allocating resources in order to detect extreme values.