Search Results for author: Rémy Degenne

Found 14 papers, 2 papers with code

On the Existence of a Complexity in Fixed Budget Bandit Identification

no code implementations16 Mar 2023 Rémy Degenne

We show that there is no such complexity for several fixed budget identification tasks, including Bernoulli best arm identification with two arms: there is no single algorithm that attains the best possible rate everywhere.

Dealing with Unknown Variances in Best-Arm Identification

no code implementations3 Oct 2022 Marc Jourdan, Rémy Degenne, Emilie Kaufmann

The problem of identifying the best arm among a collection of items with Gaussian reward distributions is well understood when the variances are known.
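
For orientation on the known-variance case that the abstract refers to, a standard result from the fixed-confidence literature (not a claim from this paper) is that for two Gaussian arms with means $\mu_1 > \mu_2$ and known variances $\sigma_1^2, \sigma_2^2$, the characteristic time governing the asymptotic sample complexity of $\delta$-correct algorithms is

$$T^*(\mu) \;=\; \frac{2(\sigma_1 + \sigma_2)^2}{(\mu_1 - \mu_2)^2}, \qquad \text{attained by sampling proportions } w_i^* \propto \sigma_i,$$

so roughly $T^*(\mu)\log(1/\delta)$ samples suffice as $\delta \to 0$. When the variances are unknown, the $\sigma_i$ in this allocation must themselves be estimated, which is the difficulty the paper addresses.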

Top Two Algorithms Revisited

no code implementations13 Jun 2022 Marc Jourdan, Rémy Degenne, Dorian Baudry, Rianne de Heide, Emilie Kaufmann

Top Two algorithms arose as an adaptation of Thompson sampling to best arm identification in multi-armed bandit models (Russo, 2016), for parametric families of arms.

Thompson Sampling
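
For orientation, here is a minimal sketch of one sampling step of Top Two Thompson Sampling (Russo, 2016) for Bernoulli arms, the kind of algorithm the paper revisits. The Beta(1, 1) prior, the fixed leader probability beta = 0.5, and the name `ttts_choose_arm` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ttts_choose_arm(successes, failures, beta=0.5, rng=None):
    """One sampling step of Top Two Thompson Sampling for Bernoulli arms.

    successes/failures: per-arm counts under a Beta(1, 1) prior.
    With probability beta play the posterior leader; otherwise resample
    the posterior until a different arm (the challenger) comes out on top.
    """
    rng = rng or np.random.default_rng()
    theta = rng.beta(successes + 1, failures + 1)
    leader = int(np.argmax(theta))
    if rng.random() < beta:
        return leader
    while True:  # resample until the top arm changes
        theta = rng.beta(successes + 1, failures + 1)
        challenger = int(np.argmax(theta))
        if challenger != leader:
            return challenger
```

The fixed beta is the usual default; tuning or adapting it is one of the questions such papers study.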

Choosing Answers in $\varepsilon$-Best-Answer Identification for Linear Bandits

no code implementations9 Jun 2022 Marc Jourdan, Rémy Degenne

In pure-exploration problems, information is gathered sequentially to answer a question on the stochastic environment.

On Elimination Strategies for Bandit Fixed-Confidence Identification

1 code implementation22 May 2022 Andrea Tirinzoni, Rémy Degenne

Elimination algorithms for bandit identification, which prune the plausible correct answers sequentially until only one remains, are computationally convenient since they reduce the problem size over time.
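
As a rough illustration of the elimination template (a generic sketch, not the authors' algorithm), a basic successive-elimination loop for fixed-confidence best-arm identification with rewards in [0, 1] might look as follows; the Hoeffding-style radius and the crude union bound are assumptions made for the example.

```python
import numpy as np

def successive_elimination(pull, n_arms, delta):
    """Generic successive elimination for fixed-confidence best-arm identification.

    `pull(arm)` returns a reward in [0, 1]. Arms whose upper confidence bound
    falls below the best lower bound are pruned; the last survivor is returned.
    """
    active = list(range(n_arms))
    sums = np.zeros(n_arms)
    t = 0
    while len(active) > 1:
        t += 1
        for arm in active:  # sample every active arm once per round
            sums[arm] += pull(arm)
        means = sums[active] / t
        # Hoeffding-style radius with a crude union bound over arms and rounds
        radius = np.sqrt(np.log(4 * n_arms * t**2 / delta) / (2 * t))
        best_lower = means.max() - radius
        active = [a for a, m in zip(active, means) if m + radius >= best_lower]
    return active[0]
```

The computational appeal mentioned in the abstract is visible here: each round only iterates over the shrinking `active` set.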

Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification

1 code implementation NeurIPS 2021 Clémence Réda, Andrea Tirinzoni, Rémy Degenne

In this work, we first derive a tractable lower bound on the sample complexity of any $\delta$-correct algorithm for the general Top-m identification problem.

Recommendation Systems

Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits

no code implementations NeurIPS 2021 Reda Ouhamma, Rémy Degenne, Pierre Gaillard, Vianney Perchet

In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions.
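
A minimal non-adaptive baseline makes the setting concrete: split the budget evenly across arms, then classify each arm against the threshold from its empirical mean. This sketch is only a reference point and assumes a hypothetical `pull(arm)` callable; adaptive algorithms instead reallocate samples toward arms whose means are hard to classify.

```python
import numpy as np

def uniform_thresholding(pull, n_arms, budget, threshold):
    """Naive fixed-budget baseline for the thresholding bandit problem.

    The budget is split evenly across arms; each arm is then declared above or
    below `threshold` based on its empirical mean.
    """
    per_arm = max(1, budget // n_arms)  # at least one sample per arm
    means = np.array([np.mean([pull(a) for _ in range(per_arm)])
                      for a in range(n_arms)])
    return means >= threshold  # boolean vector of sign guesses
```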

Gamification of Pure Exploration for Linear Bandits

no code implementations ICML 2020 Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko

We investigate an active pure-exploration setting that includes best-arm identification, in the context of linear stochastic bandits.

Experimental Design

Structure Adaptive Algorithms for Stochastic Bandits

no code implementations ICML 2020 Rémy Degenne, Han Shao, Wouter M. Koolen

We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, where the mean rewards of arms satisfy some given structural constraints, e.g., linear, unimodal, sparse, etc.

Non-Asymptotic Pure Exploration by Solving Games

no code implementations NeurIPS 2019 Rémy Degenne, Wouter M. Koolen, Pierre Ménard

Pure exploration (aka active testing) is the fundamental task of sequentially gathering information to answer a query about a stochastic environment.

Pure Exploration with Multiple Correct Answers

no code implementations NeurIPS 2019 Rémy Degenne, Wouter M. Koolen

We present a new algorithm which extends Track-and-Stop to the multiple-answer case and has asymptotic sample complexity matching the lower bound.

Bridging the gap between regret minimization and best arm identification, with application to A/B tests

no code implementations9 Oct 2018 Rémy Degenne, Thomas Nedelec, Clément Calauzènes, Vianney Perchet

State-of-the-art online learning procedures focus either on selecting the best alternative ("best arm identification") or on minimizing the cost (the "regret").

Bandits with Side Observations: Bounded vs. Logarithmic Regret

no code implementations10 Jul 2018 Rémy Degenne, Evrard Garcelon, Vianney Perchet

We consider the classical stochastic multi-armed bandit, except that, from time to time and roughly with frequency $\epsilon$, an extra observation is gathered by the agent for free.
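
A small simulation of this feedback model may help: the played arm's reward counts toward regret, and with probability $\epsilon$ one extra arm's reward is also revealed for free. The Gaussian rewards, the plain UCB1 player, and the uniformly chosen free observation are assumptions of the sketch, not the paper's algorithm.

```python
import numpy as np

def run_with_side_observations(means, horizon, eps, rng=None):
    """Simulate bandits with occasional free side observations under a UCB1 player."""
    rng = rng or np.random.default_rng()
    k = len(means)
    counts, sums, regret = np.zeros(k), np.zeros(k), 0.0
    for t in range(1, horizon + 1):
        safe = np.maximum(counts, 1)
        ucb = np.where(counts > 0,
                       sums / safe + np.sqrt(2 * np.log(t) / safe),
                       np.inf)
        arm = int(np.argmax(ucb))
        sums[arm] += rng.normal(means[arm], 1.0)
        counts[arm] += 1
        regret += max(means) - means[arm]
        if rng.random() < eps:  # free side observation, no regret incurred
            extra = rng.integers(k)
            sums[extra] += rng.normal(means[extra], 1.0)
            counts[extra] += 1
    return regret

# e.g. run_with_side_observations([0.0, 0.5, 1.0], horizon=10_000, eps=0.05)
```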

Combinatorial semi-bandit with known covariance

no code implementations NeurIPS 2016 Rémy Degenne, Vianney Perchet

We introduce a way to quantify the dependency structure of the problem and design an algorithm that adapts to it.
