Search Results for author: Mastane Achab

Found 12 papers, 1 paper with code

A Risk-Averse Framework for Non-Stationary Stochastic Multi-Armed Bandits

no code implementations • 24 Oct 2023 • REDA ALAMI, Mohammed Mahfoud, Mastane Achab

In a typical stochastic multi-armed bandit problem, the objective is often to maximize the expected sum of rewards over some time horizon $T$.

Change Point Detection, Multi-Armed Bandits

Deep Reinforcement Learning Algorithms for Hybrid V2X Communication: A Benchmarking Study

no code implementations • 4 Oct 2023 • Fouzi Boukhalfa, REDA ALAMI, Mastane Achab, Eric Moulines, Mehdi Bennis

In today's era, autonomous vehicles demand a safety level on par with aircraft.

Autonomous Vehicles, Benchmarking, +1

Beyond Log-Concavity: Theory and Algorithm for Sum-Log-Concave Optimization

no code implementations • 26 Sep 2023 • Mastane Achab

In particular, we show that such functions are in general not convex but still satisfy generalized convexity inequalities.

regression

A Nested Matrix-Tensor Model for Noisy Multi-view Clustering

no code implementations • 31 May 2023 • Mohamed El Amine Seddik, Mastane Achab, Henrique Goulart, Merouane Debbah

In order to study the theoretical performance of this approach, we characterize the behavior of this best rank-one approximation in terms of the alignments of the obtained component vectors with the hidden model parameter vectors, in the large-dimensional regime.

Clustering

One-Step Distributional Reinforcement Learning

no code implementations • 27 Apr 2023 • Mastane Achab, REDA ALAMI, Yasser Abdelaziz Dahou Djilali, Kirill Fedyanin, Eric Moulines

Reinforcement learning (RL) allows an agent interacting sequentially with an environment to maximize its long-term expected return.

Distributional Reinforcement Learning, reinforcement-learning, +1

Robustness and risk management via distributional dynamic programming

no code implementations • 28 Dec 2021 • Mastane Achab, Gergely Neu

In dynamic programming (DP) and reinforcement learning (RL), an agent learns to act optimally in terms of expected long-term return by sequentially interacting with its environment modeled by a Markov decision process (MDP).

Distributional Reinforcement Learning, Management, +2

Weighted Empirical Risk Minimization: Sample Selection Bias Correction based on Importance Sampling

no code implementations • 12 Feb 2020 • Robin Vogel, Mastane Achab, Stéphan Clémençon, Charles Tillier

We consider statistical learning problems, when the distribution $P'$ of the training observations $Z'_1,\; \ldots,\; Z'_n$ differs from the distribution $P$ involved in the risk one seeks to minimize (referred to as the test distribution) but is still defined on the same measurable space as $P$ and dominates it.

Selection bias, Transfer Learning

Weighted Empirical Risk Minimization: Transfer Learning based on Importance Sampling

no code implementations • 25 Sep 2019 • Robin Vogel, Mastane Achab, Charles Tillier, Stéphan Clémençon

We consider statistical learning problems, when the distribution $P'$ of the training observations $Z'_1,\; \ldots,\; Z'_n$ differs from the distribution $P$ involved in the risk one seeks to minimize (referred to as the test distribution) but is still defined on the same measurable space as $P$ and dominates it.

Transfer Learning

Dimensionality Reduction and (Bucket) Ranking: a Mass Transportation Approach

1 code implementation • 15 Oct 2018 • Mastane Achab, Anna Korba, Stephan Clémençon

Whereas most dimensionality reduction techniques (e.g., PCA, ICA, NMF) for multivariate data essentially rely on linear algebra to a certain extent, summarizing ranking data, viewed as realizations of a random permutation $\Sigma$ on a set of items indexed by $i\in \{1,\ldots,\; n\}$, is a great statistical challenge, due to the absence of vector space structure for the set of permutations $\mathfrak{S}_n$.

Dimensionality Reduction

Profitable Bandits

no code implementations • 8 May 2018 • Mastane Achab, Stephan Clémençon, Aurélien Garivier

We adapt and study three well-known strategies for this purpose that were proved to be most efficient in other settings: kl-UCB, Bayes-UCB and Thompson Sampling.

Management, Thompson Sampling

Ranking Data with Continuous Labels through Oriented Recursive Partitions

no code implementations • NeurIPS 2017 • Stephan Clémençon, Mastane Achab

This problem generalizes bi/multi-partite ranking to a certain extent and the task of finding optimal scoring functions $s(x)$ can be naturally cast as optimization of a dedicated functional criterion, called the IROC curve here, or as maximization of the Kendall $\tau$ related to the pair $(s(X), Y)$.

Max K-armed bandit: On the ExtremeHunter algorithm and beyond

no code implementations • 27 Jul 2017 • Mastane Achab, Stephan Clémençon, Aurélien Garivier, Anne Sabourin, Claire Vernade

This paper is devoted to the study of the max K-armed bandit problem, which consists in sequentially allocating resources in order to detect extreme values.
