Search Results for author: Pierre Gaillard

Found 34 papers, 7 papers with code

Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards

no code implementations ICML 2020 Aadirupa Saha, Pierre Gaillard, Michal Valko

The best existing efficient (i.e., polynomial-time) algorithms for this problem only guarantee an $O(T^{2/3})$ upper bound on the regret.

Stop Relying on No-Choice and Do not Repeat the Moves: Optimal, Efficient and Practical Algorithms for Assortment Optimization

no code implementations 29 Feb 2024 Aadirupa Saha, Pierre Gaillard

In this paper, we design efficient algorithms for the problem of regret minimization in assortment selection with Plackett-Luce (PL) based user choices.

Recommendation Systems

Covariance-Adaptive Least-Squares Algorithm for Stochastic Combinatorial Semi-Bandits

no code implementations 23 Feb 2024 Julien Zhou, Pierre Gaillard, Thibaud Rahier, Houssam Zenati, Julyan Arbel

We address the problem of stochastic combinatorial semi-bandits, where a player can select from $P$ subsets of a set containing $d$ base items.

Online Learning Approach for Survival Analysis

no code implementations 7 Feb 2024 Camila Fernandez, Pierre Gaillard, Joseph de Vilmarest, Olivier Wintenberger

We introduce an online mathematical framework for survival analysis, allowing real-time adaptation to dynamic environments and censored data.

Survival Analysis

Efficient Model-Based Concave Utility Reinforcement Learning through Greedy Mirror Descent

no code implementations 30 Nov 2023 Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane

Many machine learning tasks can be solved by minimizing a convex function of an occupancy measure over the policies that generate them.

Imitation Learning reinforcement-learning

Adaptive approximation of monotone functions

no code implementations 14 Sep 2023 Pierre Gaillard, Sébastien Gerchinovitz, Étienne de Montbrun

We prove that GreedyBox achieves an optimal sample complexity for any function $f$, up to logarithmic factors.

Numerical Integration

Sequential Counterfactual Risk Minimization

1 code implementation 23 Feb 2023 Houssam Zenati, Eustache Diemert, Matthieu Martin, Julien Mairal, Pierre Gaillard

Counterfactual Risk Minimization (CRM) is a framework for dealing with the logged bandit feedback problem, where the goal is to improve a logging policy using offline data.

counterfactual
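
As a rough illustration of the logged-bandit-feedback setup behind CRM, the sketch below estimates a target policy's value from logged data with inverse propensity scoring; the function name and the policy interface are hypothetical, and this is not the paper's SCRM algorithm.

```python
import numpy as np

def ips_value_estimate(contexts, actions, rewards, logging_probs, target_policy):
    """Inverse-propensity-scoring estimate of a target policy's value from
    logged bandit feedback (hypothetical interface, for illustration only)."""
    # probability that the target policy assigns to each logged action
    target_probs = np.array([target_policy(x)[a] for x, a in zip(contexts, actions)])
    weights = target_probs / np.asarray(logging_probs)    # importance weights
    return float(np.mean(weights * np.asarray(rewards)))  # unbiased value estimate
```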

Reimagining Demand-Side Management with Mean Field Learning

no code implementations 16 Feb 2023 Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard, Nadia Oudjane

Integrating renewable energy into the power grid while balancing supply and demand is a complex issue, given its intermittent nature.

Management

One Arrow, Two Kills: A Unified Framework for Achieving Optimal Regret Guarantees in Sleeping Bandits

no code implementations 26 Oct 2022 Pierre Gaillard, Aadirupa Saha, Soham Dan

We address the problem of 'Internal Regret' in Sleeping Bandits in the fully adversarial setup, draw connections between the different existing notions of sleeping regret in the multi-armed bandit (MAB) literature, and analyze their implications. Our first contribution is to propose a new notion of Internal Regret for sleeping MAB.

Versatile Dueling Bandits: Best-of-both-World Analyses for Online Learning from Preferences

no code implementations 14 Feb 2022 Aadirupa Saha, Pierre Gaillard

We study the problem of $K$-armed dueling bandits for both stochastic and adversarial environments, where the goal of the learner is to aggregate information through relative preferences over pairs of decision points queried in an online sequential manner.

Multi-Armed Bandits

Efficient Kernel UCB for Contextual Bandits

1 code implementation 11 Feb 2022 Houssam Zenati, Alberto Bietti, Eustache Diemert, Julien Mairal, Matthieu Martin, Pierre Gaillard

While standard methods require $O(CT^3)$ complexity, where $T$ is the horizon and the constant $C$ is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems.

Computational Efficiency Multi-Armed Bandits
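
For context, a naive kernel-UCB scoring rule recomputed from the full history looks roughly like the sketch below; it illustrates the cubic-in-horizon cost that efficient variants avoid (the RBF kernel and helper names are placeholders, not the paper's implementation).

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    return np.exp(-gamma * np.sum((np.asarray(x) - np.asarray(y)) ** 2))

def kernel_ucb_score(x, X_hist, y_hist, beta=1.0, lam=1.0, kernel=rbf):
    """Naive kernel-UCB score for one candidate x, recomputed from the full
    history; this repeated O(t^3) matrix inversion is the expensive baseline."""
    if len(X_hist) == 0:
        return np.inf
    K = np.array([[kernel(a, b) for b in X_hist] for a in X_hist])
    k_x = np.array([kernel(x, a) for a in X_hist])
    A_inv = np.linalg.inv(K + lam * np.eye(len(X_hist)))
    mean = k_x @ A_inv @ np.asarray(y_hist, dtype=float)
    var = kernel(x, x) - k_x @ A_inv @ k_x
    return mean + beta * np.sqrt(max(var, 0.0))
```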

Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits

no code implementations NeurIPS 2021 Reda Ouhamma, Rémy Degenne, Pierre Gaillard, Vianney Perchet

In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions.
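
A naive fixed-budget baseline for this setting simply splits the budget evenly across the distributions and thresholds each empirical mean; the sketch below (with a hypothetical sampling interface) shows that baseline, not the paper's algorithm.

```python
import numpy as np

def uniform_sign_identification(sample_fns, budget, seed=0):
    """Naive fixed-budget baseline: split the budget evenly across the K
    distributions and report the sign of each empirical mean."""
    rng = np.random.default_rng(seed)
    per_arm = budget // len(sample_fns)
    means = [np.mean([draw(rng) for _ in range(per_arm)]) for draw in sample_fns]
    return np.sign(means)

# example with three hypothetical Gaussian arms of means -0.2, 0.05 and 0.4
arms = [lambda r, m=m: r.normal(m, 1.0) for m in (-0.2, 0.05, 0.4)]
print(uniform_sign_identification(arms, budget=3000))
```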

Dueling Bandits with Adversarial Sleeping

no code implementations NeurIPS 2021 Aadirupa Saha, Pierre Gaillard

The goal is to find an optimal 'no-regret' policy that can identify the best available item at each round, as opposed to the standard 'fixed best-arm regret' objective of dueling bandits.

Management Multi-Armed Bandits

A Continuized View on Nesterov Acceleration for Stochastic Gradient Descent and Randomized Gossip

1 code implementation 10 Jun 2021 Mathieu Even, Raphaël Berthier, Francis Bach, Nicolas Flammarion, Pierre Gaillard, Hadrien Hendrikx, Laurent Massoulié, Adrien Taylor

We introduce the continuized Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter.
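
For reference, the classical discrete-time Nesterov iteration that the continuized variant revisits can be sketched as follows; this is a generic textbook form, not the paper's continuized scheme.

```python
import numpy as np

def nesterov(grad, x0, step, n_iters):
    """Textbook discrete-time Nesterov acceleration for a smooth convex function."""
    x = np.array(x0, dtype=float)
    x_prev = x.copy()
    for t in range(1, n_iters + 1):
        y = x + (t - 1) / (t + 2) * (x - x_prev)   # momentum / extrapolation step
        x_prev, x = x, y - step * grad(y)          # gradient step at the extrapolated point
    return x

# example: minimize a simple quadratic with step below 1/L
A = np.diag([1.0, 10.0])
x_min = nesterov(lambda z: A @ z, x0=[5.0, 5.0], step=0.09, n_iters=200)
```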

A Continuized View on Nesterov Acceleration

no code implementations 11 Feb 2021 Raphaël Berthier, Francis Bach, Nicolas Flammarion, Pierre Gaillard, Adrien Taylor

We introduce the "continuized" Nesterov acceleration, a close variant of Nesterov acceleration whose variables are indexed by a continuous time parameter.

Distributed, Parallel, and Cluster Computing Optimization and Control

Online nonparametric regression with Sobolev kernels

no code implementations 6 Feb 2021 Oleksandr Zadorozhnyi, Pierre Gaillard, Sébastien Gerchinovitz, Alessandro Rudi

In this work we investigate a variant of the online kernelized ridge regression algorithm in the setting of $d$-dimensional adversarial nonparametric regression.

regression

Non-stationary Online Regression

no code implementations 13 Nov 2020 Anant Raj, Pierre Gaillard, Christophe Saad

To the best of our knowledge, this work is the first extension of non-stationary online regression to non-stationary kernel regression.

regression Time Series +1

Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

no code implementations NeurIPS 2020 Raphaël Berthier, Francis Bach, Pierre Gaillard

In the context of statistical supervised learning, the noiseless linear model assumes that there exists a deterministic linear relation $Y = \langle \theta_*, \Phi(U) \rangle$ between the random output $Y$ and the random feature vector $\Phi(U)$, a potentially non-linear transformation of the inputs $U$.
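
A minimal simulation of this noiseless linear model, with a hypothetical polynomial feature map, shows constant-step SGD recovering $\theta_*$; this only illustrates the setting, not the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noiseless linear model: Y = <theta_*, Phi(U)> for a (possibly nonlinear) feature map Phi.
theta_star = np.array([1.0, -2.0, 0.5])
phi = lambda u: np.array([1.0, u, u ** 2])      # hypothetical feature map

theta = np.zeros(3)
for _ in range(5000):
    u = rng.uniform(-1.0, 1.0)
    x = phi(u)
    y = theta_star @ x                          # no additive noise on the output
    theta -= 0.3 * (theta @ x - y) * x          # constant-step SGD on the square loss

print(np.linalg.norm(theta - theta_star))       # small: SGD converges in the noiseless model
```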

Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards

no code implementations 14 Apr 2020 Aadirupa Saha, Pierre Gaillard, Michal Valko

We then study the most general version of the problem where at each round the available sets are generated from some unknown arbitrary distribution (i.e., without the independence assumption) and propose an efficient algorithm with an $O(\sqrt{2^K T})$ regret guarantee.

Efficient improper learning for online logistic regression

no code implementations 18 Mar 2020 Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi

We consider the setting of online logistic regression and consider the regret with respect to the $\ell_2$-ball of radius $B$.

regression
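
As a point of comparison, a proper baseline for this setting is projected online gradient descent kept inside the comparator ball of radius $B$; the sketch below shows that generic baseline (the paper's improper algorithm works differently).

```python
import numpy as np

def project_l2(w, B):
    """Project onto the l2-ball of radius B."""
    norm = np.linalg.norm(w)
    return w if norm <= B else (B / norm) * w

def online_logistic_ogd(stream, d, B, eta=0.1):
    """Projected online gradient descent for the logistic loss, constrained to
    the comparator ball of radius B (a proper baseline, for illustration)."""
    w = np.zeros(d)
    for x, y in stream:                                  # labels y in {-1, +1}
        grad = -y * x / (1.0 + np.exp(y * (w @ x)))      # gradient of the logistic loss
        w = project_l2(w - eta * grad, B)
    return w
```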

Experimental Comparison of Semi-parametric, Parametric, and Machine Learning Models for Time-to-Event Analysis Through the Concordance Index

1 code implementation 13 Mar 2020 Camila Fernandez, Chung Shue Chen, Pierre Gaillard, Alonso Silva

In this paper, we make an experimental comparison of semi-parametric (Cox proportional hazards model, Aalen's additive regression model), parametric (Weibull AFT model), and machine learning models (Random Survival Forest, Gradient Boosting with Cox Proportional Hazards Loss, DeepSurv) through the concordance index on two different datasets (PBC and GBCSG2).

BIG-bench Machine Learning regression
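
Since all models are compared via the concordance index, here is a plain O(n^2) version of that metric as a reminder of what is being reported; it is a generic implementation, not the code used in the paper.

```python
def concordance_index(times, events, risk_scores):
    """Plain O(n^2) concordance index: among comparable pairs (i observed to
    fail before j), count how often i received the higher risk score."""
    num, den = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:          # censored subjects cannot anchor a comparable pair
            continue
        for j in range(n):
            if times[i] < times[j]:
                den += 1
                if risk_scores[i] > risk_scores[j]:
                    num += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    num += 0.5
    return num / den
```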

Efficient online learning with kernels for adversarial large scale problems

1 code implementation NeurIPS 2019 Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi

For $d$-dimensional inputs, we provide a (close to) optimal regret of order $O((\log n)^{d+1})$ with per-round time complexity and space complexity $O((\log n)^{2d})$.

regression

Uniform regret bounds over $\mathbb{R}^d$ for the sequential linear regression problem with the square loss

no code implementations 29 May 2018 Pierre Gaillard, Sébastien Gerchinovitz, Malo Huard, Gilles Stoltz

In the case of sequentially revealed features, we also derive an asymptotic regret bound of $d B^2 \ln T$ for any individual sequence of features and bounded observations.

regression
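
A standard forecaster for sequential linear regression with the square loss is online ridge regression; the sketch below shows that generic forecaster, which is not necessarily the exact strategy analyzed in the paper.

```python
import numpy as np

def online_ridge(stream, d, lam=1.0):
    """Online ridge regression: predict with the current regularized
    least-squares fit, then update it with the revealed (x, y) pair."""
    A = lam * np.eye(d)                    # regularized Gram matrix
    b = np.zeros(d)
    predictions = []
    for x, y in stream:
        w = np.linalg.solve(A, b)          # current least-squares weights
        predictions.append(float(w @ x))   # predict before observing y
        A += np.outer(x, x)
        b += y * x
    return predictions
```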

Efficient online algorithms for fast-rate regret bounds under sparsity

no code implementations NeurIPS 2018 Pierre Gaillard, Olivier Wintenberger

We establish new risk bounds that are adaptive to the sparsity of the problem and to the regularity of the risk (ranging from a rate $1/\sqrt{T}$ for general convex risk to $1/T$ for strongly convex risk).

Accelerated Gossip in Networks of Given Dimension using Jacobi Polynomial Iterations

1 code implementation 22 May 2018 Raphaël Berthier, Francis Bach, Pierre Gaillard

We develop a method for solving the gossip problem that depends only on the spectral dimension of the network, that is, in the communication-network setup, the dimension of the space in which the agents live.

Denoising
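
For background, plain synchronous gossip averaging, the baseline that polynomial-based iterations accelerate, can be sketched as follows (generic code, not the paper's Jacobi-polynomial method).

```python
import numpy as np

def simple_gossip(W, x0, n_rounds):
    """Plain synchronous gossip averaging x <- W x with a doubly stochastic
    gossip matrix W; every agent repeatedly averages with its neighbours."""
    x = np.array(x0, dtype=float)
    for _ in range(n_rounds):
        x = W @ x
    return x                       # tends to the mean of x0 on a connected graph
```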

Algorithmic Chaining and the Role of Partial Feedback in Online Nonparametric Learning

no code implementations 27 Feb 2017 Nicolò Cesa-Bianchi, Pierre Gaillard, Claudio Gentile, Sébastien Gerchinovitz

We investigate contextual online learning with nonparametric (Lipschitz) comparison classes under different assumptions on losses and feedback information.

A Chaining Algorithm for Online Nonparametric Regression

no code implementations 26 Feb 2015 Pierre Gaillard, Sébastien Gerchinovitz

We consider the problem of online nonparametric regression with arbitrary deterministic sequences.

Computational Efficiency regression

A Second-order Bound with Excess Losses

no code implementations 10 Feb 2014 Pierre Gaillard, Gilles Stoltz, Tim van Erven

We study online aggregation of the predictions of experts, and first show new second-order regret bounds in the standard setting, which are obtained via a version of the Prod algorithm (and also a version of the polynomially weighted average algorithm) with multiple learning rates.
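
A single-learning-rate version of the Prod update reads roughly as below; the paper's algorithm uses one learning rate per expert, so this is only a simplified sketch.

```python
import numpy as np

def prod_aggregation(expert_losses, eta=0.1):
    """Single-learning-rate Prod update on losses in [0, 1]; an expert's weight
    grows multiplicatively when its loss falls below the mixture's loss."""
    T, N = expert_losses.shape
    w = np.ones(N) / N
    total_loss = 0.0
    for t in range(T):
        p = w / w.sum()
        mixture_loss = float(p @ expert_losses[t])
        total_loss += mixture_loss
        w = w * (1.0 + eta * (mixture_loss - expert_losses[t]))
    return total_loss, w / w.sum()
```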

Mirror Descent Meets Fixed Share (and feels no regret)

no code implementations NeurIPS 2012 Nicolò Cesa-Bianchi, Pierre Gaillard, Gabor Lugosi, Gilles Stoltz

Mirror descent with an entropic regularizer is known to achieve shifting regret bounds that are logarithmic in the dimension.
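
For context, the Fixed Share update mixes an exponential-weights step with a small uniform share; the sketch below shows that generic update and is not tied to the paper's exact analysis.

```python
import numpy as np

def fixed_share(expert_losses, eta=0.5, alpha=0.05):
    """Fixed Share: an exponential-weights step followed by redistributing a
    small fraction alpha of the mass uniformly, which tracks shifting experts."""
    T, N = expert_losses.shape
    w = np.ones(N) / N
    for t in range(T):
        v = w * np.exp(-eta * expert_losses[t])   # exponential-weights update
        v = v / v.sum()
        w = alpha / N + (1.0 - alpha) * v         # share step
    return w
```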
