Search Results for author: Pierre Gaillard

Found 34 papers, 7 papers with code

Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards

no code implementations ICML 2020 Aadirupa Saha, Pierre Gaillard, Michal Valko

The best existing efficient (i.e., polynomial-time) algorithms for this problem only guarantee an $O(T^{2/3})$ upper bound on the regret.

Stop Relying on No-Choice and Do not Repeat the Moves: Optimal, Efficient and Practical Algorithms for Assortment Optimization

no code implementations 29 Feb 2024 Aadirupa Saha, Pierre Gaillard

In this paper, we design efficient algorithms for the problem of regret minimization in assortment selection with Plackett-Luce (PL) based user choices.

Recommendation Systems

Covariance-Adaptive Least-Squares Algorithm for Stochastic Combinatorial Semi-Bandits

no code implementations 23 Feb 2024 Julien Zhou, Pierre Gaillard, Thibaud Rahier, Houssam Zenati, Julyan Arbel

We address the problem of stochastic combinatorial semi-bandits, where a player can select from $P$ subsets of a set containing $d$ base items.

Online Learning Approach for Survival Analysis

no code implementations 7 Feb 2024 Camila Fernandez, Pierre Gaillard, Joseph de Vilmarest, Olivier Wintenberger

We introduce an online mathematical framework for survival analysis, allowing real-time adaptation to dynamic environments and censored data.

Survival Analysis

Efficient Model-Based Concave Utility Reinforcement Learning through Greedy Mirror Descent

no code implementations 30 Nov 2023 Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane

Many machine learning tasks can be solved by minimizing a convex function of an occupancy measure over the policies that generate them.

Imitation Learning reinforcement-learning

Adaptive approximation of monotone functions

no code implementations 14 Sep 2023 Pierre Gaillard, Sébastien Gerchinovitz, Étienne de Montbrun

We prove that GreedyBox achieves an optimal sample complexity for any function $f$, up to logarithmic factors.

Numerical Integration

Sequential Counterfactual Risk Minimization

1 code implementation 23 Feb 2023 Houssam Zenati, Eustache Diemert, Matthieu Martin, Julien Mairal, Pierre Gaillard

Counterfactual Risk Minimization (CRM) is a framework for dealing with the logged bandit feedback problem, where the goal is to improve a logging policy using offline data.

counterfactual
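
As a rough illustration of the logged-bandit-feedback setup behind CRM, the sketch below estimates a target policy's value from logged data with inverse propensity scoring; the function name and the policy interface are hypothetical, and this is not the paper's SCRM algorithm.

```python
import numpy as np

def ips_value_estimate(contexts, actions, rewards, logging_probs, target_policy):
    """Inverse-propensity-scoring estimate of a target policy's value from
    logged bandit feedback (hypothetical interface, for illustration only)."""
    # probability that the target policy assigns to each logged action
    target_probs = np.array([target_policy(x)[a] for x, a in zip(contexts, actions)])
    weights = target_probs / np.asarray(logging_probs)    # importance weights
    return float(np.mean(weights * np.asarray(rewards)))  # unbiased value estimate
```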

Reimagining Demand-Side Management with Mean Field Learning

no code implementations 16 Feb 2023 Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane

Integrating renewable energy into the power grid while balancing supply and demand is a complex issue, given its intermittent nature.

Management

One Arrow, Two Kills: A Unified Framework for Achieving Optimal Regret Guarantees in Sleeping Bandits

no code implementations 26 Oct 2022 Pierre Gaillard, Aadirupa Saha, Soham Dan

We address the problem of 'Internal Regret' in Sleeping Bandits in the fully adversarial setup, draw connections between the different existing notions of sleeping regret in the multi-armed bandit (MAB) literature, and analyze their implications. Our first contribution is to propose a new notion of Internal Regret for sleeping MAB.

Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences

no code implementations 14 Feb 2022 Aadirupa Saha, Pierre Gaillard

We study the problem of $K$-armed dueling bandits for both stochastic and adversarial environments, where the goal of the learner is to aggregate information through relative preferences over pairs of decision points queried in an online sequential manner.

Multi-Armed Bandits

Efficient Kernel UCB for Contextual Bandits

1 code implementation 11 Feb 2022 Houssam Zenati, Alberto Bietti, Eustache Diemert, Julien Mairal, Matthieu Martin, Pierre Gaillard

While standard methods require $O(CT^3)$ complexity, where $T$ is the horizon and the constant $C$ is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems.

Computational Efficiency Multi-Armed Bandits
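
For context, a naive kernel-UCB scoring rule recomputed from the full history looks roughly like the sketch below; it illustrates the cubic-in-horizon cost that efficient variants avoid (the RBF kernel and helper names are placeholders, not the paper's implementation).

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(y)) ** 2))

def kernel_ucb_score(x, X_hist, y_hist, beta=1.0, lam=1.0, kernel=rbf):
    """Naive kernel-UCB score for one candidate x, recomputed from the full
    history; this repeated O(t^3) matrix inversion is the expensive baseline."""
    if len(X_hist) == 0:
        return np.inf
    K = np.array([[kernel(a, b) for b in X_hist] for a in X_hist])
    k_x = np.array([kernel(x, a) for a in X_hist])
    A_inv = np.linalg.inv(K + lam * np.eye(len(X_hist)))
    mean = k_x @ A_inv @ np.asarray(y_hist, dtype=float)
    var = kernel(x, x) - k_x @ A_inv @ k_x
    return mean + beta * np.sqrt(max(var, 0.0))
```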

Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits

no code implementations NeurIPS 2021 Reda Ouhamma, Rémy Degenne, Pierre Gaillard, Vianney Perchet

In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions.
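
A naive fixed-budget baseline for this setting simply splits the budget evenly across the distributions and thresholds each empirical mean; the sketch below (with a hypothetical sampling interface) shows that baseline, not the paper's algorithm.

```python
import numpy as np

def uniform_sign_identification(sample_fns, budget, seed=0):
    """Naive fixed-budget baseline: split the budget evenly across the K
    distributions and report the sign of each empirical mean."""
    rng = np.random.default_rng(seed)
    per_arm = budget // len(sample_fns)
    means = [np.mean([draw(rng) for _ in range(per_arm)]) for draw in sample_fns]
    return np.sign(means)

# example with three hypothetical Gaussian arms of means -0.2, 0.05 and 0.4
arms = [lambda r, m=m: r.normal(m, 1.0) for m in (-0.2, 0.05, 0.4)]
print(uniform_sign_identification(arms, budget=3000))
```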

Dueling Bandits with Adversarial Sleeping

no code implementations NeurIPS 2021 Aadirupa Saha, Pierre Gaillard

The goal is to find an optimal 'no-regret' policy that can identify the best available item at each round, as opposed to the standard 'fixed best-arm regret' objective of dueling bandits.

Management Multi-Armed Bandits

A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip

1 code implementation 10 Jun 2021 Mathieu Even, Raphaël Berthier, Francis Bach, Nicolas Flammarion, Pierre Gaillard, Hadrien Hendrikx, Laurent Massoulié, Adrien Taylor

We introduce the continuized Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter.
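
For reference, the classical discrete-time Nesterov iteration that the continuized variant revisits can be sketched as follows; this is a generic textbook form, not the paper's continuized scheme.

```python
import numpy as np

def nesterov(grad, x0, step, n_iters):
    """Textbook discrete-time Nesterov acceleration for a smooth convex function."""
    x = np.array(x0, dtype=float)
    x_prev = x.copy()
    for t in range(1, n_iters + 1):
        y = x + (t - 1) / (t + 2) * (x - x_prev)   # momentum / extrapolation step
        x_prev, x = x, y - step * grad(y)          # gradient step at the extrapolated point
    return x

# example: minimize a simple quadratic with step below 1/L
A = np.diag([1.0, 10.0])
x_min = nesterov(lambda z: A @ z, x0=[5.0, 5.0], step=0.09, n_iters=200)
```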

A Continuized View on Nesterov Acceleration

no code implementations 11 Feb 2021 Raphaël Berthier, Francis Bach, Nicolas Flammarion, Pierre Gaillard, Adrien Taylor

We introduce the "continuized" Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter.

Distributed, Parallel, and Cluster Computing Optimization and Control

Online nonparametric regression with Sobolev kernels

no code implementations 6 Feb 2021 Oleksandr Zadorozhnyi, Pierre Gaillard, Sébastien Gerchinovitz, Alessandro Rudi

In this work we investigate a variant of the online kernelized ridge regression algorithm in the setting of $d$-dimensional adversarial nonparametric regression.

regression

Non-stationary Online Regression

no code implementations 13 Nov 2020 Anant Raj, Pierre Gaillard, Christophe Saad

To the best of our knowledge, this work is the first extension of non-stationary online regression to non-stationary kernel regression.

regression Time Series +1

Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

no code implementations NeurIPS 2020 Raphaël Berthier, Francis Bach, Pierre Gaillard

In the context of statistical supervised learning, the noiseless linear model assumes that there exists a deterministic linear relation $Y = \langle \theta_*, \Phi(U) \rangle$ between the random output $Y$ and the random feature vector $\Phi(U)$, a potentially non-linear transformation of the inputs $U$.
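
A minimal simulation of this noiseless linear model, with a hypothetical polynomial feature map, shows constant-step SGD recovering $\theta_*$; this only illustrates the setting, not the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noiseless linear model: Y = <theta_*, Phi(U)> for a (possibly nonlinear) feature map Phi.
theta_star = np.array([1.0, -2.0, 0.5])
phi = lambda u: np.array([1.0, u, u ** 2])      # hypothetical feature map

theta = np.zeros(3)
for _ in range(5000):
    u = rng.uniform(-1.0, 1.0)
    x = phi(u)
    y = theta_star @ x                          # no additive noise on the output
    theta -= 0.3 * (theta @ x - y) * x          # constant-step SGD on the square loss

print(np.linalg.norm(theta - theta_star))       # small: SGD converges in the noiseless model
```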

Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards

no code implementations 14 Apr 2020 Aadirupa Saha, Pierre Gaillard, Michal Valko

We then study the most general version of the problem where at each round the available sets are generated from some unknown arbitrary distribution (i.e., without the independence assumption) and propose an efficient algorithm with an $O(\sqrt{2^K T})$ regret guarantee.

Efficient improper learning for online logistic regression

no code implementations 18 Mar 2020 Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi

We consider the setting of online logistic regression and consider the regret with respect to the $\ell_2$-ball of radius $B$.

regression
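
As a point of comparison, a proper baseline for this setting is projected online gradient descent kept inside the comparator ball of radius $B$; the sketch below shows that generic baseline (the paper's improper algorithm works differently).

```python
import numpy as np

def project_l2(w, B):
    """Project onto the l2-ball of radius B."""
    norm = np.linalg.norm(w)
    return w if norm <= B else (B / norm) * w

def online_logistic_ogd(stream, d, B, eta=0.1):
    """Projected online gradient descent for the logistic loss, constrained to
    the comparator ball of radius B (a proper baseline, for illustration)."""
    w = np.zeros(d)
    for x, y in stream:                                  # labels y in {-1, +1}
        grad = -y * x / (1.0 + np.exp(y * (w @ x)))      # gradient of the logistic loss
        w = project_l2(w - eta * grad, B)
    return w
```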

Experimental Comparison of Semi-parametric, Parametric, and Machine Learning Models for Time-to-Event Analysis Through the Concordance Index

1 code implementation 13 Mar 2020 Camila Fernandez, Chung Shue Chen, Pierre Gaillard, Alonso Silva

In this paper, we make an experimental comparison of semi-parametric (Cox proportional hazards model, Aalen's additive regression model), parametric (Weibull AFT model), and machine learning models (Random Survival Forest, Gradient Boosting with Cox Proportional Hazards Loss, DeepSurv) through the concordance index on two different datasets (PBC and GBCSG2).

BIG-bench Machine Learning regression
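
Since all models are compared via the concordance index, here is a plain O(n^2) version of that metric as a reminder of what is being reported; it is a generic implementation, not the code used in the paper.

```python
def concordance_index(times, events, risk_scores):
    """Plain O(n^2) concordance index: among comparable pairs (i observed to
    fail before j), count how often i received the higher risk score."""
    num, den = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:          # censored subjects cannot anchor a comparable pair
            continue
        for j in range(n):
            if times[i] < times[j]:
                den += 1
                if risk_scores[i] > risk_scores[j]:
                    num += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    num += 0.5
    return num / den
```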

Efficient online learning with kernels for adversarial large scale problems

1 code implementation NeurIPS 2019 Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi

For $d$-dimensional inputs, we provide a (close to) optimal regret of order $O((\log n)^{d+1})$ with per-round time complexity and space complexity $O((\log n)^{2d})$.

regression

Uniform regret bounds over $\mathbb{R}^d$ for the sequential linear regression problem with the square loss

no code implementations 29 May 2018 Pierre Gaillard, Sébastien Gerchinovitz, Malo Huard, Gilles Stoltz

In the case of sequentially revealed features, we also derive an asymptotic regret bound of $d B^2 \ln T$ for any individual sequence of features and bounded observations.

regression
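
A standard forecaster for sequential linear regression with the square loss is online ridge regression; the sketch below shows that generic forecaster, which is not necessarily the exact strategy analyzed in the paper.

```python
import numpy as np

def online_ridge(stream, d, lam=1.0):
    """Online ridge regression: predict with the current regularized
    least-squares fit, then update it with the revealed (x, y) pair."""
    A = lam * np.eye(d)                    # regularized Gram matrix
    b = np.zeros(d)
    predictions = []
    for x, y in stream:
        w = np.linalg.solve(A, b)          # current least-squares weights
        predictions.append(float(w @ x))   # predict before observing y
        A += np.outer(x, x)
        b += y * x
    return predictions
```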

Efficient online algorithms for fast-rate regret bounds under sparsity

no code implementations NeurIPS 2018 Pierre Gaillard, Olivier Wintenberger

We establish new risk bounds that are adaptive to the sparsity of the problem and to the regularity of the risk (ranging from a rate $1/\sqrt{T}$ for general convex risk to $1/T$ for strongly convex risk).

Accelerated Gossip in Networks of Given Dimension using Jacobi Polynomial Iterations

1 code implementation 22 May 2018 Raphaël Berthier, Francis Bach, Pierre Gaillard

We develop a method for solving the gossip problem that depends only on the spectral dimension of the network, that is, in the communication-network setup, the dimension of the space in which the agents live.

Denoising
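
For background, plain synchronous gossip averaging, the baseline that polynomial-based iterations accelerate, can be sketched as follows (generic code, not the paper's Jacobi-polynomial method).

```python
import numpy as np

def simple_gossip(W, x0, n_rounds):
    """Plain synchronous gossip averaging x <- W x with a doubly stochastic
    gossip matrix W; every agent repeatedly averages with its neighbours."""
    x = np.array(x0, dtype=float)
    for _ in range(n_rounds):
        x = W @ x
    return x                       # tends to the mean of x0 on a connected graph
```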

Algorithmic Chaining and the Role of Partial Feedback in Online Nonparametric Learning

no code implementations 27 Feb 2017 Nicolò Cesa-Bianchi, Pierre Gaillard, Claudio Gentile, Sébastien Gerchinovitz

We investigate contextual online learning with nonparametric (Lipschitz) comparison classes under different assumptions on losses and feedback information.

A Chaining Algorithm for Online Nonparametric Regression

no code implementations 26 Feb 2015 Pierre Gaillard, Sébastien Gerchinovitz

We consider the problem of online nonparametric regression with arbitrary deterministic sequences.

Computational Efficiency regression

A Second-order Bound with Excess Losses

no code implementations 10 Feb 2014 Pierre Gaillard, Gilles Stoltz, Tim van Erven

We study online aggregation of the predictions of experts, and first show new second-order regret bounds in the standard setting, which are obtained via a version of the Prod algorithm (and also a version of the polynomially weighted average algorithm) with multiple learning rates.
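
A single-learning-rate version of the Prod update reads roughly as below; the paper's algorithm uses one learning rate per expert, so this is only a simplified sketch.

```python
import numpy as np

def prod_aggregation(expert_losses, eta=0.1):
    """Single-learning-rate Prod update on losses in [0, 1]; an expert's weight
    grows multiplicatively when its loss falls below the mixture's loss."""
    T, N = expert_losses.shape
    w = np.ones(N) / N
    total_loss = 0.0
    for t in range(T):
        p = w / w.sum()
        mixture_loss = float(p @ expert_losses[t])
        total_loss += mixture_loss
        w = w * (1.0 + eta * (mixture_loss - expert_losses[t]))
    return total_loss, w / w.sum()
```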

Mirror Descent Meets Fixed Share (and feels no regret)

no code implementations NeurIPS 2012 Nicolò Cesa-Bianchi, Pierre Gaillard, Gabor Lugosi, Gilles Stoltz

Mirror descent with an entropic regularizer is known to achieve shifting regret bounds that are logarithmic in the dimension.
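
For context, the Fixed Share update mixes an exponential-weights step with a small uniform share; the sketch below shows that generic update and is not tied to the paper's exact analysis.

```python
import numpy as np

def fixed_share(expert_losses, eta=0.5, alpha=0.05):
    """Fixed Share: an exponential-weights step followed by redistributing a
    small fraction alpha of the mass uniformly, which tracks shifting experts."""
    T, N = expert_losses.shape
    w = np.ones(N) / N
    for t in range(T):
        v = w * np.exp(-eta * expert_losses[t])   # exponential-weights update
        v = v / v.sum()
        w = alpha / N + (1.0 - alpha) * v         # share step
    return w
```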
