Search Results for author: Nishant A. Mehta

Found 15 papers, 1 paper with code

On the price of exact truthfulness in incentive-compatible online learning with bandit feedback: A regret lower bound for WSU-UX

no code implementations • 8 Apr 2024 • Ali Mortazavi, Junhao Lin, Nishant A. Mehta

In this work, our goal is to design an algorithm for the selfish experts problem that is incentive-compatible (IC, or \emph{truthful}), meaning each expert's best strategy is to report truthfully, while also ensuring the algorithm enjoys sublinear regret with respect to the expert with the best belief.
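
WSU-UX itself is not reproduced here; for orientation, below is a minimal sketch of the classical (non-truthful) Hedge / multiplicative-weights baseline for prediction with expert advice, assuming losses in $[0, 1]$. The function and parameter names are illustrative only.

```python
import numpy as np

def hedge(expert_losses, eta=0.1, rng=None):
    """Minimal Hedge (multiplicative weights) over K experts.

    expert_losses: (T, K) array of per-round losses in [0, 1].
    Note: Hedge is not incentive-compatible; WSU-UX modifies the update
    so that truthful reporting becomes each expert's best strategy.
    """
    rng = rng or np.random.default_rng(0)
    T, K = expert_losses.shape
    weights = np.ones(K)
    total_loss = 0.0
    for t in range(T):
        probs = weights / weights.sum()              # play an expert ~ probs
        chosen = rng.choice(K, p=probs)
        total_loss += expert_losses[t, chosen]
        weights *= np.exp(-eta * expert_losses[t])   # multiplicative update
    return total_loss
```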

Near-optimal Per-Action Regret Bounds for Sleeping Bandits

no code implementations • 2 Mar 2024 • Quan Nguyen, Nishant A. Mehta

In a setting with $K$ total arms and at most $A$ available arms in each round over $T$ rounds, the best known upper bound is $O(K\sqrt{TA\ln{K}})$, obtained indirectly via minimizing internal sleeping regrets.
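
Per-action regret compares the learner to each arm only over the rounds in which that arm was available. Below is a minimal sketch of the sleeping-bandits interaction protocol only (not the paper's algorithm); `learner` is a hypothetical policy that must choose from the available set.

```python
import numpy as np

def run_sleeping_bandit(T, mean_rewards, learner, rng=None):
    """Sleeping-bandit protocol with Bernoulli rewards (illustration only)."""
    rng = rng or np.random.default_rng(0)
    mean_rewards = np.asarray(mean_rewards)
    K = mean_rewards.size
    benchmark = np.zeros(K)          # each arm's cumulative mean when available
    learner_vs_arm = np.zeros(K)     # learner's cumulative mean on those rounds
    for t in range(T):
        available = np.flatnonzero(rng.random(K) < 0.5)   # example availability
        if available.size == 0:
            continue
        arm = learner(t, available)                       # must lie in `available`
        _ = rng.binomial(1, mean_rewards[arm])            # bandit feedback only
        benchmark[available] += mean_rewards[available]
        learner_vs_arm[available] += mean_rewards[arm]
    return benchmark - learner_vs_arm                     # per-action (pseudo-)regret
```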

An improved regret analysis for UCB-N and TS-N

no code implementations • 6 May 2023 • Nishant A. Mehta

In the setting of stochastic online learning with undirected feedback graphs, Lykouris et al. (2020) previously analyzed the pseudo-regret of the upper confidence bound-based algorithm UCB-N and the Thompson Sampling-based algorithm TS-N.
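
For context, a minimal sketch of UCB-style play with an undirected feedback graph, assuming Bernoulli rewards: playing an arm reveals the rewards of that arm and all of its neighbours, and every revealed reward updates the corresponding empirical mean. The confidence radius and graph structure below are illustrative; the paper's contribution is the sharper pseudo-regret analysis, not a new playing rule.

```python
import numpy as np

def ucb_graph_feedback(T, means, neighbors, rng=None):
    """UCB with side observations from an undirected feedback graph (sketch).

    neighbors[i]: arms whose rewards are observed when arm i is played
    (assumed to include i itself).
    """
    rng = rng or np.random.default_rng(0)
    means = np.asarray(means)
    K = means.size
    counts, sums, regret = np.zeros(K), np.zeros(K), 0.0
    for t in range(1, T + 1):
        n = np.maximum(counts, 1)
        ucb = sums / n + np.sqrt(2 * np.log(t) / n)
        ucb[counts == 0] = np.inf                  # force initial exploration
        arm = int(np.argmax(ucb))
        regret += means.max() - means[arm]
        for j in neighbors[arm]:                   # side observations
            sums[j] += rng.binomial(1, means[j])
            counts[j] += 1
    return regret
```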

Thompson Sampling

Adversarial Online Multi-Task Reinforcement Learning

1 code implementation • 11 Jan 2023 • Quan Nguyen, Nishant A. Mehta

We prove a minimax lower bound of $\Omega(K\sqrt{DSAH})$ on the regret of any learning algorithm and an instance-specific lower bound of $\Omega(\frac{K}{\lambda^2})$ in sample complexity for a class of uniformly-good cluster-then-learn algorithms.

Reinforcement Learning (RL)

Best-Case Lower Bounds in Online Learning

no code implementations • NeurIPS 2021 • Cristóbal Guzmán, Nishant A. Mehta, Ali Mortazavi

Much of the work in online learning focuses on the study of sublinear upper bounds on the regret.

Fairness

Near-Optimal Algorithms for Private Online Learning in a Stochastic Environment

no code implementations • 16 Feb 2021 • Bingshan Hu, Zhiming Huang, Nishant A. Mehta

Specifically, for the problem of decision-theoretic online learning with stochastic rewards, we present the first algorithm that achieves an $ O \left( \frac{ \log K}{ \Delta_{\min}} + \frac{\log(K) \min\{\log (\frac{1}{\Delta_{\min}}), \log(T)\}}{\epsilon} \right)$ regret bound, where $\Delta_{\min}$ is the minimum mean reward gap.
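
The sketch below is not the paper's algorithm; it only illustrates one standard way differential privacy enters decision-theoretic online learning, by perturbing cumulative rewards with Laplace noise before selecting an action. The noise scale and the implied privacy accounting are purely indicative.

```python
import numpy as np

def noisy_follow_the_leader(reward_matrix, epsilon=1.0, rng=None):
    """Follow-the-leader on Laplace-perturbed totals (illustration only).

    reward_matrix: (T, K) rewards in [0, 1], fully revealed after each round
    (decision-theoretic online learning). A careful algorithm would also
    account for how the per-round noise composes into an overall privacy
    guarantee.
    """
    rng = rng or np.random.default_rng(0)
    T, K = reward_matrix.shape
    totals, collected = np.zeros(K), 0.0
    for t in range(T):
        noisy = totals + rng.laplace(scale=1.0 / epsilon, size=K)
        action = int(np.argmax(noisy))             # noisy "leader"
        collected += reward_matrix[t, action]
        totals += reward_matrix[t]                 # full-information update
    return collected
```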

A Farewell to Arms: Sequential Reward Maximization on a Budget with a Giving Up Option

no code implementations • 6 Mar 2020 • P Sharoff, Nishant A. Mehta, Ravi Ganti

We consider a sequential decision-making problem where an agent can take one action at a time and each action has a stochastic temporal extent, i.e., a new action cannot be taken until the previous one is finished.

Decision Making • Multi-Armed Bandits

Dying Experts: Efficient Algorithms with Optimal Regret Bounds

no code implementations • NeurIPS 2019 • Hamid Shayestehmanesh, Sajjad Azami, Nishant A. Mehta

In both cases, we provide matching upper and lower bounds on the ranking regret in the fully adversarial setting.

Multi-Observation Regression

no code implementations • 27 Feb 2018 • Rafael Frongillo, Nishant A. Mehta, Tom Morgan, Bo Waggoner

Recent work introduced loss functions which measure the error of a prediction based on multiple simultaneous observations or outcomes.
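
One concrete (illustrative, not necessarily from the paper) instance: with two i.i.d. observations $y_1, y_2$, the loss $(p - \tfrac{1}{2}(y_1 - y_2)^2)^2$ is minimized in expectation at $p = \mathrm{Var}(Y)$, since $\mathbb{E}[\tfrac{1}{2}(Y_1 - Y_2)^2] = \mathrm{Var}(Y)$; the variance thus becomes elicitable from a pair of observations even though it is not elicitable from a single one.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=3.0, size=(100_000, 2))    # pairs of i.i.d. draws

def two_obs_loss(p, y1, y2):
    """Two-observation loss whose expected value is minimized at Var(Y)."""
    return (p - 0.5 * (y1 - y2) ** 2) ** 2

candidates = np.linspace(0.0, 20.0, 401)
avg_loss = [two_obs_loss(p, y[:, 0], y[:, 1]).mean() for p in candidates]
print(candidates[int(np.argmin(avg_loss))])               # close to the true variance 9.0
```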

regression

CompAdaGrad: A Compressed, Complementary, Computationally-Efficient Adaptive Gradient Method

no code implementations • 12 Sep 2016 • Nishant A. Mehta, Alistair Rendell, Anish Varghese, Christfried Webers

The adaptive gradient online learning method known as AdaGrad has seen widespread use in the machine learning community in stochastic and adversarial online learning problems and more recently in deep learning methods.
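
CompAdaGrad itself (which compresses the full-matrix variant) is not sketched here; for reference, below is the standard diagonal AdaGrad update it builds on, with per-coordinate step sizes that shrink with the accumulated squared gradients.

```python
import numpy as np

def adagrad_step(x, grad, grad_sq_sum, lr=0.1, eps=1e-8):
    """One diagonal AdaGrad step."""
    grad_sq_sum += grad ** 2
    x -= lr * grad / (np.sqrt(grad_sq_sum) + eps)
    return x, grad_sq_sum

# usage: minimize f(x) = ||x - 1||^2 starting from the origin
x, acc = np.zeros(5), np.zeros(5)
for _ in range(500):
    g = 2 * (x - 1.0)                  # gradient of the quadratic
    x, acc = adagrad_step(x, g, acc)
print(x)                               # approaches the all-ones vector
```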

Fast rates with high probability in exp-concave statistical learning

no code implementations • 4 May 2016 • Nishant A. Mehta

We present an algorithm for the statistical learning setting with a bounded exp-concave loss in $d$ dimensions that obtains excess risk $O(d \log(1/\delta)/n)$ with probability at least $1 - \delta$.
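
For reference, a loss $\ell(\cdot, z)$ is $\eta$-exp-concave on a convex set $\mathcal{W}$ if, for every $z$, the map $w \mapsto \exp(-\eta\, \ell(w, z))$ is concave on $\mathcal{W}$. For example, the log loss $\ell(w, (x, y)) = -\log p_w(y \mid x)$ is $1$-exp-concave, and the squared loss is exp-concave on bounded domains; these are the losses for which $O(d/n)$-type excess risk rates of the kind above are available.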

Model Selection

Fast Rates for General Unbounded Loss Functions: from ERM to Generalized Bayes

no code implementations • 1 May 2016 • Peter D. Grünwald, Nishant A. Mehta

For general loss functions, our bounds rely on two separate conditions: the $v$-GRIP (generalized reversed information projection) conditions, which control the lower tail of the excess loss; and the newly introduced witness condition, which controls the upper tail.

Bayesian Inference

Fast rates in statistical and online learning

no code implementations • 9 Jul 2015 • Tim van Erven, Peter D. Grünwald, Nishant A. Mehta, Mark D. Reid, Robert C. Williamson

For bounded losses, we show how the central condition enables a direct proof of fast rates and we prove its equivalence to the Bernstein condition, itself a generalization of the Tsybakov margin condition, both of which have played a central role in obtaining fast rates in statistical learning.
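
In one common formulation (stated here for orientation; details in the paper may differ), the $(\beta, B)$-Bernstein condition requires, for every $f$ in the class,
$$\mathbb{E}\bigl[(\ell_f(Z) - \ell_{f^*}(Z))^2\bigr] \;\le\; B\,\bigl(\mathbb{E}[\ell_f(Z) - \ell_{f^*}(Z)]\bigr)^{\beta},$$
where $f^*$ is the risk minimizer and $\beta \in (0, 1]$; larger $\beta$ gives faster rates, with $\beta = 1$ the strongest case.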

Density Estimation • Learning Theory

From Stochastic Mixability to Fast Rates

no code implementations • NeurIPS 2014 • Nishant A. Mehta, Robert C. Williamson

In the non-statistical prediction with expert advice setting, there is an analogous slow and fast rate phenomenon, and it is entirely characterized in terms of the mixability of the loss $\ell$ (there being no role there for $\mathcal{F}$ or $\mathsf{P}$).
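
For reference, the standard definition (not specific to this paper): a loss $\ell$ is $\eta$-mixable if for every distribution $\pi$ over predictions there is a single prediction $p$ such that, for all outcomes $y$,
$$\ell(p, y) \;\le\; -\frac{1}{\eta}\,\log \mathbb{E}_{q \sim \pi}\bigl[e^{-\eta\, \ell(q, y)}\bigr].$$
Mixable losses are exactly those for which the aggregating algorithm attains regret that is constant in $T$ (of order $(\ln K)/\eta$ with $K$ experts).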
