Search Results for author: Pierre Perrault

Found 9 papers, 1 papers with code

Budgeted Online Influence Maximization

no code implementations • ICML 2020 • Pierre Perrault, Zheng Wen, Michal Valko, Jennifer Healey

We introduce a new budgeted framework for online influence maximization, considering the total cost of an advertising campaign instead of the common cardinality constraint on a chosen influencer set.

valid

Paper
Add Code

Demonstration-Regularized RL

no code implementations • 26 Oct 2023 • Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Menard

In particular, we study the demonstration-regularized reinforcement learning that leverages the expert demonstrations by KL-regularization for a policy learned by behavior cloning.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Fast Rates for Maximum Entropy Exploration

1 code implementation • 14 Mar 2023 • Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Remi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Menard

Finally, we apply developed regularization techniques to reduce sample complexity of visitation entropy maximization to $\widetilde{\mathcal{O}}(H^2SA/\varepsilon^2)$, yielding a statistical separation between maximum entropy exploration and reward-free exploration.

Reinforcement Learning (RL)

Paper
Code

When Combinatorial Thompson Sampling meets Approximation Regret

no code implementations • 22 Feb 2023 • Pierre Perrault

We provide the first $\mathcal{O}(\log(T)/\Delta)$ approximation regret upper bound for CTS, obtained under a specific condition on the approximation oracle, allowing a reduction to the exact oracle analysis.

Thompson Sampling

Paper
Add Code

On the Approximation Relationship between Optimizing Ratio of Submodular (RS) and Difference of Submodular (DS) Functions

no code implementations • 5 Jan 2021 • Pierre Perrault, Jennifer Healey, Zheng Wen, Michal Valko

We demonstrate that from an algorithm guaranteeing an approximation factor for the ratio of submodular (RS) optimization problem, we can build another algorithm having a different kind of approximation guarantee -- weaker than the classical one -- for the difference of submodular (DS) optimization problem, and vice versa.

Data Structures and Algorithms

Paper
Add Code

Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits

no code implementations • NeurIPS 2020 • Pierre Perrault, Etienne Boursier, Vianney Perchet, Michal Valko

In CMAB, the question of the existence of an efficient policy with an optimal asymptotic regret (up to a factor poly-logarithmic with the action size) is still open for many families of distributions, including mutually independent outcomes, and more generally the multivariate sub-Gaussian family.

Thompson Sampling

Paper
Add Code

Online A-Optimal Design and Active Linear Regression

no code implementations • 20 Jun 2019 • Xavier Fontaine, Pierre Perrault, Michal Valko, Vianney Perchet

By trying to minimize the $\ell^2$-loss $\mathbb{E} [\lVert\hat{\beta}-\beta^{\star}\rVert^2]$ the decision maker is actually minimizing the trace of the covariance matrix of the problem, which corresponds then to online A-optimal design.

regression

Paper
Add Code

Exploiting Structure of Uncertainty for Efficient Matroid Semi-Bandits

no code implementations • 11 Feb 2019 • Pierre Perrault, Vianney Perchet, Michal Valko

We improve the efficiency of algorithms for stochastic \emph{combinatorial semi-bandits}.

Paper
Add Code

Finding the bandit in a graph: Sequential search-and-stop

no code implementations • 6 Jun 2018 • Pierre Perrault, Vianney Perchet, Michal Valko

We consider the problem where an agent wants to find a hidden object that is randomly located in some vertex of a directed acyclic graph (DAG) according to a fixed but possibly unknown distribution.

Multi-Armed Bandits

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.