Search Results for author: Nishant A. Mehta

Found 15 papers, 1 paper with code

On the price of exact truthfulness in incentive-compatible online learning with bandit feedback: A regret lower bound for WSU-UX

no code implementations • 8 Apr 2024 • Ali Mortazavi, Junhao Lin, Nishant A. Mehta

In this work, our goal is to design an algorithm for the selfish experts problem that is incentive-compatible (IC, or \emph{truthful}), meaning each expert's best strategy is to report truthfully, while also ensuring the algorithm enjoys sublinear regret with respect to the expert with the best belief.
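
WSU-UX itself is not reproduced here; for orientation, below is a minimal sketch of the classical (non-truthful) Hedge / multiplicative-weights baseline for prediction with expert advice, assuming losses in $[0, 1]$. The function and parameter names are illustrative only.

```python
import numpy as np

def hedge(expert_losses, eta=0.1, rng=None):
    """Minimal Hedge (multiplicative weights) over K experts.

    expert_losses: (T, K) array of per-round losses in [0, 1].
    Note: Hedge is not incentive-compatible; WSU-UX modifies the update
    so that truthful reporting becomes each expert's best strategy.
    """
    rng = rng or np.random.default_rng(0)
    T, K = expert_losses.shape
    weights = np.ones(K)
    total_loss = 0.0
    for t in range(T):
        probs = weights / weights.sum()              # play an expert ~ probs
        chosen = rng.choice(K, p=probs)
        total_loss += expert_losses[t, chosen]
        weights *= np.exp(-eta * expert_losses[t])   # multiplicative update
    return total_loss
```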

Near-optimal Per-Action Regret Bounds for Sleeping Bandits

no code implementations • 2 Mar 2024 • Quan Nguyen, Nishant A. Mehta

In a setting with $K$ total arms and at most $A$ available arms in each round over $T$ rounds, the best known upper bound is $O(K\sqrt{TA\ln{K}})$, obtained indirectly via minimizing internal sleeping regrets.
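
Per-action regret compares the learner to each arm only over the rounds in which that arm was available. Below is a minimal sketch of the sleeping-bandits interaction protocol only (not the paper's algorithm); `learner` is a hypothetical policy that must choose from the available set.

```python
import numpy as np

def run_sleeping_bandit(T, mean_rewards, learner, rng=None):
    """Sleeping-bandit protocol with Bernoulli rewards (illustration only)."""
    rng = rng or np.random.default_rng(0)
    mean_rewards = np.asarray(mean_rewards)
    K = mean_rewards.size
    benchmark = np.zeros(K)          # each arm's cumulative mean when available
    learner_vs_arm = np.zeros(K)     # learner's cumulative mean on those rounds
    for t in range(T):
        available = np.flatnonzero(rng.random(K) < 0.5)   # example availability
        if available.size == 0:
            continue
        arm = learner(t, available)                       # must lie in `available`
        _ = rng.binomial(1, mean_rewards[arm])            # bandit feedback only
        benchmark[available] += mean_rewards[available]
        learner_vs_arm[available] += mean_rewards[arm]
    return benchmark - learner_vs_arm                     # per-action (pseudo-)regret
```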

An improved regret analysis for UCB-N and TS-N

no code implementations • 6 May 2023 • Nishant A. Mehta

In the setting of stochastic online learning with undirected feedback graphs, Lykouris et al. (2020) previously analyzed the pseudo-regret of the upper confidence bound-based algorithm UCB-N and the Thompson Sampling-based algorithm TS-N.
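
For context, a minimal sketch of UCB-style play with an undirected feedback graph, assuming Bernoulli rewards: playing an arm reveals the rewards of that arm and all of its neighbours, and every revealed reward updates the corresponding empirical mean. The confidence radius and graph structure below are illustrative; the paper's contribution is the sharper pseudo-regret analysis, not a new playing rule.

```python
import numpy as np

def ucb_graph_feedback(T, means, neighbors, rng=None):
    """UCB with side observations from an undirected feedback graph (sketch).

    neighbors[i]: arms whose rewards are observed when arm i is played
    (assumed to include i itself).
    """
    rng = rng or np.random.default_rng(0)
    means = np.asarray(means)
    K = means.size
    counts, sums, regret = np.zeros(K), np.zeros(K), 0.0
    for t in range(1, T + 1):
        n = np.maximum(counts, 1)
        ucb = sums / n + np.sqrt(2 * np.log(t) / n)
        ucb[counts == 0] = np.inf                  # force initial exploration
        arm = int(np.argmax(ucb))
        regret += means.max() - means[arm]
        for j in neighbors[arm]:                   # side observations
            sums[j] += rng.binomial(1, means[j])
            counts[j] += 1
    return regret
```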

Thompson Sampling

Adversarial Online Multi-Task Reinforcement Learning

1 code implementation • 11 Jan 2023 • Quan Nguyen, Nishant A. Mehta

We prove a minimax lower bound of $\Omega(K\sqrt{DSAH})$ on the regret of any learning algorithm and an instance-specific lower bound of $\Omega(\frac{K}{\lambda^2})$ in sample complexity for a class of uniformly-good cluster-then-learn algorithms.

Reinforcement Learning (RL)

Best-Case Lower Bounds in Online Learning

no code implementations • NeurIPS 2021 • Cristóbal Guzmán, Nishant A. Mehta, Ali Mortazavi

Much of the work in online learning focuses on the study of sublinear upper bounds on the regret.

Fairness

Near-Optimal Algorithms for Private Online Learning in a Stochastic Environment

no code implementations • 16 Feb 2021 • Bingshan Hu, Zhiming Huang, Nishant A. Mehta

Specifically, for the problem of decision-theoretic online learning with stochastic rewards, we present the first algorithm that achieves an $ O \left( \frac{ \log K}{ \Delta_{\min}} + \frac{\log(K) \min\{\log (\frac{1}{\Delta_{\min}}), \log(T)\}}{\epsilon} \right)$ regret bound, where $\Delta_{\min}$ is the minimum mean reward gap.
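
The sketch below is not the paper's algorithm; it only illustrates one standard way differential privacy enters decision-theoretic online learning, by perturbing cumulative rewards with Laplace noise before selecting an action. The noise scale and the implied privacy accounting are purely indicative.

```python
import numpy as np

def noisy_follow_the_leader(reward_matrix, epsilon=1.0, rng=None):
    """Follow-the-leader on Laplace-perturbed totals (illustration only).

    reward_matrix: (T, K) rewards in [0, 1], fully revealed after each round
    (decision-theoretic online learning). A careful algorithm would also
    account for how the per-round noise composes into an overall privacy
    guarantee.
    """
    rng = rng or np.random.default_rng(0)
    T, K = reward_matrix.shape
    totals, collected = np.zeros(K), 0.0
    for t in range(T):
        noisy = totals + rng.laplace(scale=1.0 / epsilon, size=K)
        action = int(np.argmax(noisy))             # noisy "leader"
        collected += reward_matrix[t, action]
        totals += reward_matrix[t]                 # full-information update
    return collected
```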

A Farewell to Arms: Sequential Reward Maximization on a Budget with a Giving Up Option

no code implementations • 6 Mar 2020 • P Sharoff, Nishant A. Mehta, Ravi Ganti

We consider a sequential decision-making problem where an agent can take one action at a time and each action has a stochastic temporal extent, i.e., a new action cannot be taken until the previous one is finished.

Decision Making • Multi-Armed Bandits

Dying Experts: Efficient Algorithms with Optimal Regret Bounds

no code implementations • NeurIPS 2019 • Hamid Shayestehmanesh, Sajjad Azami, Nishant A. Mehta

In both cases, we provide matching upper and lower bounds on the ranking regret in the fully adversarial setting.

Multi-Observation Regression

no code implementations • 27 Feb 2018 • Rafael Frongillo, Nishant A. Mehta, Tom Morgan, Bo Waggoner

Recent work introduced loss functions which measure the error of a prediction based on multiple simultaneous observations or outcomes.
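
One concrete (illustrative, not necessarily from the paper) instance: with two i.i.d. observations $y_1, y_2$, the loss $(p - \tfrac{1}{2}(y_1 - y_2)^2)^2$ is minimized in expectation at $p = \mathrm{Var}(Y)$, since $\mathbb{E}[\tfrac{1}{2}(Y_1 - Y_2)^2] = \mathrm{Var}(Y)$; the variance thus becomes elicitable from a pair of observations even though it is not elicitable from a single one.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=3.0, size=(100_000, 2))    # pairs of i.i.d. draws

def two_obs_loss(p, y1, y2):
    """Two-observation loss whose expected value is minimized at Var(Y)."""
    return (p - 0.5 * (y1 - y2) ** 2) ** 2

candidates = np.linspace(0.0, 20.0, 401)
avg_loss = [two_obs_loss(p, y[:, 0], y[:, 1]).mean() for p in candidates]
print(candidates[int(np.argmin(avg_loss))])               # close to the true variance 9.0
```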

regression

CompAdaGrad: A Compressed, Complementary, Computationally-Efficient Adaptive Gradient Method

no code implementations • 12 Sep 2016 • Nishant A. Mehta, Alistair Rendell, Anish Varghese, Christfried Webers

The adaptive gradient online learning method known as AdaGrad has seen widespread use in the machine learning community in stochastic and adversarial online learning problems and more recently in deep learning methods.
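
CompAdaGrad itself (which compresses the full-matrix variant) is not sketched here; for reference, below is the standard diagonal AdaGrad update it builds on, with per-coordinate step sizes that shrink with the accumulated squared gradients.

```python
import numpy as np

def adagrad_step(x, grad, grad_sq_sum, lr=0.1, eps=1e-8):
    """One diagonal AdaGrad step."""
    grad_sq_sum += grad ** 2
    x -= lr * grad / (np.sqrt(grad_sq_sum) + eps)
    return x, grad_sq_sum

# usage: minimize f(x) = ||x - 1||^2 starting from the origin
x, acc = np.zeros(5), np.zeros(5)
for _ in range(500):
    g = 2 * (x - 1.0)                  # gradient of the quadratic
    x, acc = adagrad_step(x, g, acc)
print(x)                               # approaches the all-ones vector
```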

Fast rates with high probability in exp-concave statistical learning

no code implementations • 4 May 2016 • Nishant A. Mehta

We present an algorithm for the statistical learning setting with a bounded exp-concave loss in $d$ dimensions that obtains excess risk $O(d \log(1/\delta)/n)$ with probability at least $1 - \delta$.
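
For reference, a loss $\ell(\cdot, z)$ is $\eta$-exp-concave on a convex set $\mathcal{W}$ if, for every $z$, the map $w \mapsto \exp(-\eta\, \ell(w, z))$ is concave on $\mathcal{W}$. For example, the log loss $\ell(w, (x, y)) = -\log p_w(y \mid x)$ is $1$-exp-concave, and the squared loss is exp-concave on bounded domains; these are the losses for which $O(d/n)$-type excess risk rates of the kind above are available.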

Model Selection

Fast Rates for General Unbounded Loss Functions: from ERM to Generalized Bayes

no code implementations • 1 May 2016 • Peter D. Grünwald, Nishant A. Mehta

For general loss functions, our bounds rely on two separate conditions: the $v$-GRIP (generalized reversed information projection) conditions, which control the lower tail of the excess loss; and the newly introduced witness condition, which controls the upper tail.

Bayesian Inference

Fast rates in statistical and online learning

no code implementations • 9 Jul 2015 • Tim van Erven, Peter D. Grünwald, Nishant A. Mehta, Mark D. Reid, Robert C. Williamson

For bounded losses, we show how the central condition enables a direct proof of fast rates and we prove its equivalence to the Bernstein condition, itself a generalization of the Tsybakov margin condition, both of which have played a central role in obtaining fast rates in statistical learning.
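
In one common formulation (stated here for orientation; details in the paper may differ), the $(\beta, B)$-Bernstein condition requires, for every $f$ in the class,
$$\mathbb{E}\bigl[(\ell_f(Z) - \ell_{f^*}(Z))^2\bigr] \;\le\; B\,\bigl(\mathbb{E}[\ell_f(Z) - \ell_{f^*}(Z)]\bigr)^{\beta},$$
where $f^*$ is the risk minimizer and $\beta \in (0, 1]$; larger $\beta$ gives faster rates, with $\beta = 1$ the strongest case.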

Density Estimation • Learning Theory

From Stochastic Mixability to Fast Rates

no code implementations • NeurIPS 2014 • Nishant A. Mehta, Robert C. Williamson

In the non-statistical prediction with expert advice setting, there is an analogous slow and fast rate phenomenon, and it is entirely characterized in terms of the mixability of the loss $\ell$ (there being no role there for $\mathcal{F}$ or $\mathsf{P}$).
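
For reference, the standard definition (not specific to this paper): a loss $\ell$ is $\eta$-mixable if for every distribution $\pi$ over predictions there is a single prediction $p$ such that, for all outcomes $y$,
$$\ell(p, y) \;\le\; -\frac{1}{\eta}\,\log \mathbb{E}_{q \sim \pi}\bigl[e^{-\eta\, \ell(q, y)}\bigr].$$
Mixable losses are exactly those for which the aggregating algorithm attains regret that is constant in $T$ (of order $(\ln K)/\eta$ with $K$ experts).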
