Search Results for author: Dirk van der Hoeven

Found 15 papers, 0 papers with code

High-Probability Risk Bounds via Sequential Predictors

no code implementations • 15 Aug 2023 • Dirk van der Hoeven, Nikita Zhivotovskiy, Nicolò Cesa-Bianchi

Online learning methods yield sequential regret bounds under minimal assumptions and provide in-expectation risk bounds for statistical learning.

Density Estimation, Regression

Delayed Bandits: When Do Intermediate Observations Help?

no code implementations • 30 May 2023 • Emmanuel Esposito, Saeed Masoudian, Hao Qiu, Dirk van der Hoeven, Nicolò Cesa-Bianchi, Yevgeny Seldin

However, if the mapping of states to losses is stochastic, we show that the regret grows at a rate of $\sqrt{\big(K+\min\{|\mathcal{S}|, d\}\big)T}$ (within log factors), implying that if the number $|\mathcal{S}|$ of states is smaller than the delay, then intermediate observations help.

Learning on the Edge: Online Learning with Stochastic Feedback Graphs

no code implementations • 9 Oct 2022 • Emmanuel Esposito, Federico Fusco, Dirk van der Hoeven, Nicolò Cesa-Bianchi

The framework of feedback graphs is a generalization of sequential decision-making with bandit or full information feedback.

Decision Making

A Regret-Variance Trade-Off in Online Learning

no code implementations • 6 Jun 2022 • Dirk van der Hoeven, Nikita Zhivotovskiy, Nicolò Cesa-Bianchi

We prove that a variant of EWA either achieves a negative regret (i.e., the algorithm outperforms the best expert), or guarantees a $O(\log K)$ bound on both variance and regret.

Model Selection
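The paper's starting point, the classical exponentially weighted average (EWA) forecaster, fits in a few lines. Below is a minimal sketch of the standard variant, without the paper's regret-variance trade-off; the function and variable names are illustrative:

```python
import math

def ewa(loss_rounds, eta=0.5):
    """Exponentially weighted average forecaster over K experts.

    loss_rounds: list of rounds, each a length-K list of expert losses in [0, 1].
    Returns (learner_loss, regret), with regret measured against the best
    single expert in hindsight.
    """
    K = len(loss_rounds[0])
    weights = [1.0] * K          # w_i proportional to exp(-eta * cumulative loss)
    cumulative = [0.0] * K
    learner_loss = 0.0
    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        for i, l in enumerate(losses):
            cumulative[i] += l
            weights[i] *= math.exp(-eta * l)
    return learner_loss, learner_loss - min(cumulative)
```

The textbook guarantee for this scheme is regret at most $\log K/\eta + \eta T/8$ for losses in $[0, 1]$; the paper's variant modifies the weighting to trade regret against the variance of the predictions.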

A Near-Optimal Best-of-Both-Worlds Algorithm for Online Learning with Feedback Graphs

no code implementations • 1 Jun 2022 • Chloé Rouyer, Dirk van der Hoeven, Nicolò Cesa-Bianchi, Yevgeny Seldin

The algorithm combines ideas from the EXP3++ algorithm for stochastic and adversarial bandits and the EXP3.G algorithm for feedback graphs with a novel exploration scheme.

Decision Making
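For context, both cited algorithms build on the plain EXP3 template: exponential weights over importance-weighted loss estimates. A minimal sketch, without EXP3++'s extra exploration or EXP3.G's graph-based observations (names and parameters illustrative):

```python
import math
import random

def exp3(loss_fn, K, T, eta, seed=0):
    """Plain EXP3 for K-armed adversarial bandits.

    loss_fn(t, arm) returns the loss in [0, 1] of the played arm; only
    that single loss is observed each round (bandit feedback).
    """
    rng = random.Random(seed)
    log_w = [0.0] * K
    total_loss = 0.0
    for t in range(T):
        # sampling distribution proportional to exp(log_w)
        m = max(log_w)
        expw = [math.exp(lw - m) for lw in log_w]
        z = sum(expw)
        p = [e / z for e in expw]
        arm = rng.choices(range(K), weights=p)[0]
        loss = loss_fn(t, arm)
        total_loss += loss
        log_w[arm] -= eta * loss / p[arm]   # importance-weighted loss estimate
    return total_loss
```

Feedback-graph variants like EXP3.G replace the single observed loss with the losses of all out-neighbors of the played arm in the graph.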

Nonstochastic Bandits and Experts with Arm-Dependent Delays

no code implementations • 2 Nov 2021 • Dirk van der Hoeven, Nicolò Cesa-Bianchi

We study nonstochastic bandits and experts in a delayed setting where delays depend on both time and arms.

Beyond Bandit Feedback in Online Multiclass Classification

no code implementations • NeurIPS 2021 • Dirk van der Hoeven, Federico Fusco, Nicolò Cesa-Bianchi

We study the problem of online multiclass classification in a setting where the learner's feedback is determined by an arbitrary directed graph.

Classification

Distributed Online Learning for Joint Regret with Communication Constraints

no code implementations • 15 Feb 2021 • Dirk van der Hoeven, Hédi Hadiji, Tim van Erven

Each round, an adversary first activates one of the agents to issue a prediction and provides a corresponding gradient, and then the agents are allowed to send a $b$-bit message to their neighbors in the graph.
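The $b$-bit constraint can be made concrete with a simple scalar quantizer; the following is only an illustration of what a $b$-bit message could carry, not the paper's actual communication protocol:

```python
def quantize(x, b, bound=1.0):
    """Encode x in [-bound, bound] as one of 2**b levels (a b-bit message)."""
    levels = 2 ** b
    x = max(-bound, min(bound, x))
    return round((x + bound) / (2 * bound) * (levels - 1))

def dequantize(msg, b, bound=1.0):
    """Recover an approximation of x from the b-bit message."""
    levels = 2 ** b
    return -bound + msg * (2 * bound) / (levels - 1)
```

A neighbor receiving the message recovers the original value up to error at most $\mathrm{bound}/(2^b - 1)$, which is the kind of precision-versus-bandwidth trade-off the communication budget imposes.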

MetaGrad: Adaptation using Multiple Learning Rates in Online Learning

no code implementations • 12 Feb 2021 • Tim van Erven, Wouter M. Koolen, Dirk van der Hoeven

We provide a new adaptive method for online convex optimization, MetaGrad, that is robust to general convex losses yet achieves faster rates for a broad class of special functions, including exp-concave and strongly convex functions, as well as various types of stochastic and non-stochastic functions without any curvature.
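The multiple-learning-rate idea can be caricatured by running OGD copies on a grid of step sizes and mixing their iterates with exponential weights. Note this is only a loose sketch: MetaGrad's actual master aggregates its slaves through a tilted surrogate loss, not the plain losses used below, and all names here are illustrative.

```python
import math

def multi_eta_sketch(T=100):
    """Minimize f(w) = (w - 1)^2 online by mixing OGD copies with
    different learning rates via exponential weights (illustrative only)."""
    etas = [2.0 ** -i for i in range(5)]   # grid of candidate step sizes
    ws = [0.0] * len(etas)                 # one OGD iterate per step size
    log_w = [0.0] * len(etas)              # master's log-weights
    lo, hi = -2.0, 2.0
    w_master = 0.0
    for t in range(T):
        # master prediction: weight-averaged slave iterates
        m = max(log_w)
        probs = [math.exp(lw - m) for lw in log_w]
        z = sum(probs)
        w_master = sum(p / z * w for p, w in zip(probs, ws))
        # each slave takes a projected OGD step; the master downweights
        # slaves by the losses their own iterates incur
        for i, eta in enumerate(etas):
            log_w[i] -= (ws[i] - 1.0) ** 2
            grad = 2.0 * (ws[i] - 1.0)
            ws[i] = min(hi, max(lo, ws[i] - eta * grad))
    return w_master
```

The master needs no tuning: slaves whose step size is too large or too small accumulate loss and lose weight, so the mixture tracks whichever learning rate suits the loss sequence.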

Exploiting the Surrogate Gap in Online Multiclass Classification

no code implementations • NeurIPS 2020 • Dirk van der Hoeven

In the bandit classification setting we show that Gaptron is the first linear time algorithm with $O(K\sqrt{T})$ expected regret, where $K$ is the number of classes.

Classification, General Classification

Comparator-adaptive Convex Bandits

no code implementations • NeurIPS 2020 • Dirk van der Hoeven, Ashok Cutkosky, Haipeng Luo

We study bandit convex optimization methods that adapt to the norm of the comparator, a topic that has only been studied before for its full-information counterpart.

User-Specified Local Differential Privacy in Unconstrained Adaptive Online Learning

no code implementations • NeurIPS 2019 • Dirk van der Hoeven

In this paper we generalize this approach by allowing the provider of the data to choose the distribution of the noise without disclosing any parameters of the distribution to the learner, under the constraint that the distribution is symmetrical.
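The setup can be illustrated with a toy mechanism: each user perturbs their value with zero-mean symmetric noise whose scale they choose and never reveal, so aggregates remain unbiased even though the learner knows nothing about the noise. A sketch using Gaussian noise as one admissible symmetric choice (all names illustrative):

```python
import random

def privatize(value, scale, rng):
    """Perturb with zero-mean Gaussian noise; any symmetric distribution
    is allowed, and `scale` stays private to the user."""
    return value + rng.gauss(0.0, scale)

def aggregate(values, scales, seed=0):
    """The learner only ever sees the privatized values."""
    rng = random.Random(seed)
    noisy = [privatize(v, s, rng) for v, s in zip(values, scales)]
    return sum(noisy) / len(noisy)
```

Symmetry is what makes this work: the noise cancels in expectation, so the average of many privatized values concentrates around the true mean.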

The Many Faces of Exponential Weights in Online Learning

no code implementations • 21 Feb 2018 • Dirk van der Hoeven, Tim van Erven, Wojciech Kotłowski

A standard introduction to online learning might place Online Gradient Descent at its center and then proceed to develop generalizations and extensions like Online Mirror Descent and second-order methods.

Second-order methods
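The Online Gradient Descent baseline mentioned above fits in a few lines; here is a minimal projected, one-dimensional sketch with illustrative names:

```python
def ogd(gradients, eta=0.1, radius=1.0):
    """Projected Online Gradient Descent on the interval [-radius, radius].

    gradients: the subgradients g_t revealed after each prediction.
    Returns the sequence of iterates w_1, ..., w_T.
    """
    w, iterates = 0.0, []
    for g in gradients:
        iterates.append(w)
        w = max(-radius, min(radius, w - eta * g))  # gradient step + projection
    return iterates
```

With $\eta \propto 1/\sqrt{T}$ this attains $O(\sqrt{T})$ regret for convex Lipschitz losses; the paper's point is that Exponential Weights can serve as the unifying primitive from which such methods are derived instead.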
