no code implementations • 21 Feb 2024 • Yuchen Liang, Peizhong Ju, Yingbin Liang, Ness Shroff
In this paper, we establish the convergence guarantee for substantially larger classes of distributions under discrete-time diffusion models and further improve the convergence rate for distributions with bounded support.
no code implementations • 14 Dec 2023 • Hao Chen, Abhishek Gupta, Yin Sun, Ness Shroff
In particular, we provide performance guarantees for the MMD-CUSUM test under $\alpha$-, $\beta$-, and $\phi$-mixing processes, which significantly expands its utility beyond the i.i.d.
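The CUSUM side of such a test follows the classical accumulate-and-threshold recursion. The sketch below is illustrative only: the per-sample drift score stands in for the kernel MMD statistic the paper actually uses (an assumption for this example), and the detector alarms once the running statistic crosses a threshold.

```python
def cusum(scores, threshold):
    """Sequential CUSUM test: accumulate positive drift, alarm at threshold.

    `scores` should tend to be negative before a change and positive after it
    (e.g., an MMD estimate minus a calibration constant, simplified here).
    Returns the alarm time, or None if no alarm is raised.
    """
    s = 0.0
    for t, g in enumerate(scores):
        s = max(0.0, s + g)  # reset at zero: ignore pre-change drift
        if s >= threshold:
            return t
    return None

# Pre-change scores drift down, post-change scores drift up.
scores = [-0.1] * 20 + [0.5] * 20
print(cusum(scores, threshold=3.0))  # → 25
```

With these synthetic scores, the statistic stays at zero before the change at index 20 and then climbs by 0.5 per step, crossing the threshold six steps later.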
no code implementations • 4 Oct 2023 • Hao Chen, Abhishek Gupta, Yin Sun, Ness Shroff
This paper studies Hoeffding's inequality for Markov chains under the generalized concentrability condition defined via integral probability metric (IPM).
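For context, the classical i.i.d. Hoeffding bound that the paper generalizes to Markov chains can be checked empirically. The Monte Carlo setup below is purely illustrative (uniform $[0,1]$ samples, arbitrary $n$ and $t$), not drawn from the paper.

```python
import math
import random

def hoeffding_bound(n, t):
    """Classical i.i.d. Hoeffding tail bound for [0,1]-valued variables:
    P(|empirical mean - E| >= t) <= 2 exp(-2 n t^2).
    The paper extends bounds of this type to Markov chains satisfying an
    IPM-based generalized concentrability condition."""
    return 2 * math.exp(-2 * n * t * t)

rng = random.Random(42)
n, t, trials = 200, 0.1, 2000
# Empirical frequency of the sample mean deviating from 0.5 by at least t.
exceed = sum(
    abs(sum(rng.random() for _ in range(n)) / n - 0.5) >= t
    for _ in range(trials)
) / trials
print(exceed, "<=", hoeffding_bound(n, t))
```

The empirical exceedance frequency sits well below the bound, as expected for an upper bound that holds for any mean-bounded distribution.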
no code implementations • 22 Jun 2023 • Yining Li, Peizhong Ju, Ness Shroff
To address this issue, we formulate a general optimization problem for determining the optimal grouping strategy, which strikes a balance between performance loss and sample/computational complexity.
no code implementations • 14 Jun 2023 • Ming Shi, Yingbin Liang, Ness Shroff
However, existing theoretical results have shown that learning in POMDPs is intractable in the worst case, where the main challenge lies in the lack of latent state information.
no code implementations • 10 Mar 2023 • Honghao Wei, Arnob Ghosh, Ness Shroff, Lei Ying, Xingyu Zhou
We study model-free reinforcement learning (RL) algorithms in episodic non-stationary constrained Markov Decision Processes (CMDPs), in which an agent aims to maximize the expected cumulative reward subject to a cumulative constraint on the expected utility (cost).
no code implementations • 12 Feb 2023 • Sen Lin, Peizhong Ju, Yingbin Liang, Ness Shroff
In particular, there is a lack of understanding on what factors are important and how they affect "catastrophic forgetting" and generalization performance.
no code implementations • 8 Feb 2023 • Ming Shi, Yingbin Liang, Ness Shroff
In many applications of Reinforcement Learning (RL), it is critically important that the algorithm performs safely, such that instantaneous hard constraints are satisfied at each step, and unsafe states and actions are avoided.
no code implementations • 8 Feb 2023 • Ming Shi, Yingbin Liang, Ness Shroff
Our lower bound indicates that, due to the fundamental challenge of switching costs in adversarial RL, the best achievable regret (whose dependency on $T$ is $\tilde{O}(\sqrt{T})$) in static RL with switching costs (as well as in adversarial RL without switching costs) is no longer achievable.
no code implementations • 23 Jun 2022 • Arnob Ghosh, Xingyu Zhou, Ness Shroff
To this end, we consider the episodic constrained Markov decision processes with linear function approximation, where the transition dynamics and the reward function can be represented as a linear function of some known feature mapping.
1 code implementation • NeurIPS 2021 • Wenbo Ren, Jia Liu, Ness Shroff
Here, a multi-wise comparison takes $m$ items as input and returns a (noisy) result about the best item (the winner feedback) or the order of these items (the full-ranking feedback).
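One simple way to simulate the winner feedback is a Plackett–Luce-style choice model, where the winner of an $m$-wise comparison is drawn with probability proportional to its latent weight. This model is an assumption for illustration; the paper's actual noise model may differ.

```python
import random

def winner_feedback(items, theta, rng):
    """Noisy m-wise comparison under a Plackett-Luce choice model
    (an illustrative assumption): the winner among `items` is drawn
    with probability proportional to its latent weight theta[i]."""
    weights = [theta[i] for i in items]
    return rng.choices(items, weights=weights, k=1)[0]

rng = random.Random(0)
theta = {0: 5.0, 1: 1.0, 2: 1.0}  # item 0 is the true best
wins = sum(winner_feedback([0, 1, 2], theta, rng) == 0 for _ in range(1000))
print(wins)  # item 0 wins roughly 5/7 of the comparisons
```

Under this model the best item wins with probability $5/7 \approx 0.714$, which is what the empirical frequency concentrates around.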
no code implementations • 26 Aug 2021 • Sayak Ray Chowdhury, Xingyu Zhou, Ness Shroff
In this paper, we study the problem of regret minimization in reinforcement learning (RL) under differential privacy constraints.
no code implementations • 6 Jul 2021 • Yuntian Deng, Xingyu Zhou, Baekjin Kim, Ambuj Tewari, Abhishek Gupta, Ness Shroff
To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.
no code implementations • 11 Feb 2021 • Xingyu Zhou, Ness Shroff
In this paper, we consider the time-varying Bayesian optimization problem.
no code implementations • 28 Oct 2020 • Yuanjie Li, Esha Datta, Jiaxin Ding, Ness Shroff, Xin Liu
The demand for seamless Internet access under extreme user mobility, such as on high-speed trains and vehicles, has become a norm rather than an exception.
no code implementations • 13 Oct 2020 • Rahul Singh, Fang Liu, Yin Sun, Ness Shroff
We study a variant of the classical multi-armed bandit problem (MABP), which we call Multi-Armed Bandits with Dependent Arms.
no code implementations • 6 Jun 2020 • Rahul Singh, Fang Liu, Xin Liu, Ness Shroff
We show that this asymptotically optimal regret is upper-bounded as $O\left(|\chi(\mathcal{G})|\log T\right)$, where $|\chi(\mathcal{G})|$ is the domination number of $\mathcal{G}$.
no code implementations • 16 May 2019 • Fang Liu, Ness Shroff
Then we study a form of online attacks on bandit algorithms and propose an adaptive attack strategy that is effective against any bandit algorithm, without requiring knowledge of which algorithm is being used.
no code implementations • 28 Oct 2018 • Wenbo Ren, Jia Liu, Ness Shroff
Results in this paper provide reductions of up to a factor of $\rho n/k$ compared with the "$k$-exploration" algorithms that focus on finding the (PAC) best $k$ arms out of $n$ arms.
no code implementations • 23 May 2018 • Fang Liu, Zizhan Zheng, Ness Shroff
To fill this gap, we propose a variant of Thompson Sampling that attains the optimal regret in the directed setting within a logarithmic factor.
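The Beta–Bernoulli Thompson Sampling baseline that such a variant builds on is short enough to sketch: sample a mean from each arm's posterior and play the argmax. This is the standard algorithm, not the paper's directed-feedback variant.

```python
import random

def thompson_step(successes, failures, rng):
    """One round of Beta-Bernoulli Thompson Sampling (the standard
    baseline; the paper's graph-feedback variant modifies this):
    sample a mean from each arm's Beta posterior, play the argmax."""
    samples = [rng.betavariate(s + 1, f + 1)  # Beta(1,1) uniform prior
               for s, f in zip(successes, failures)]
    return samples.index(max(samples))

rng = random.Random(0)
means = [0.3, 0.7]          # true (unknown) Bernoulli arm means
succ, fail = [0, 0], [0, 0]
for _ in range(500):
    a = thompson_step(succ, fail, rng)
    if rng.random() < means[a]:
        succ[a] += 1
    else:
        fail[a] += 1
```

After 500 rounds, the posterior concentrates and the better arm (mean 0.7) receives the large majority of the pulls.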
no code implementations • 16 Apr 2018 • Fang Liu, Sinong Wang, Swapna Buccapatnam, Ness Shroff
We show that UCBoost($D$) enjoys $O(1)$ complexity for each arm per round, as well as a regret guarantee that is $1/e$-close to that of the kl-UCB algorithm.
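The comparison point here, the kl-UCB index, can be computed by bisection on the Bernoulli KL divergence: the index of an arm is the largest plausible mean consistent with its observations at confidence level $\log(t)$. This is the standard kl-UCB index, which UCBoost approximates with cheaper boosted bounds; the parameter values below are illustrative.

```python
import math

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped for safety."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb_index(mean, pulls, t, iters=50):
    """kl-UCB index: the largest q with pulls * KL(mean, q) <= log(t),
    found by bisection (KL(mean, .) is increasing on [mean, 1])."""
    target = math.log(t) / pulls
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if kl_bernoulli(mean, mid) <= target:
            lo = mid  # still feasible: push the index up
        else:
            hi = mid
    return lo

print(kl_ucb_index(0.5, 100, 1000))  # ≈ 0.68
```

Each index evaluation costs a fixed number of bisection steps; UCBoost's contribution is replacing this per-arm search with closed-form $O(1)$ bounds at a controlled ($1/e$) loss in the regret constant.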
no code implementations • NeurIPS 2017 • Sinong Wang, Ness Shroff
It is well known that, for a linear program (LP) with constraint matrix $\mathbf{A}\in\mathbb{R}^{m\times n}$, the Alternating Direction Method of Multipliers (ADMM) converges globally and linearly at a rate $O((\|\mathbf{A}\|_F^2+mn)\log(1/\epsilon))$.
no code implementations • 8 Nov 2017 • Fang Liu, Swapna Buccapatnam, Ness Shroff
We consider stochastic multi-armed bandit problems with graph feedback, where the decision maker is allowed to observe the neighboring actions of the chosen action.
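The value of graph feedback is that one pull yields several observations. A minimal sketch (a generic UCB1 with side-observations, not the paper's algorithm) assumes that pulling arm $i$ also reveals the rewards of its out-neighbors, so their statistics update for free.

```python
import math
import random

def ucb_graph_feedback(means, graph, horizon, rng):
    """UCB1 with graph side-observations (an illustrative sketch):
    pulling an arm also reveals the rewards of its out-neighbors in
    `graph`, so those arms' statistics update without being played.
    Returns the average expected reward collected over the horizon."""
    n = len(means)
    counts, sums, total = [0] * n, [0.0] * n, 0.0
    for t in range(1, horizon + 1):
        idx = [
            sums[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i])
            if counts[i] else float("inf")   # force one observation of each arm
            for i in range(n)
        ]
        arm = idx.index(max(idx))
        total += means[arm]                  # track expected reward of the play
        for j in {arm, *graph[arm]}:         # observe arm and its neighbors
            counts[j] += 1
            sums[j] += rng.random() < means[j]
    return total / horizon

rng = random.Random(1)
graph = {0: [1], 1: [2], 2: [0]}  # directed cycle of side-observations
avg = ucb_graph_feedback([0.2, 0.5, 0.8], graph, 2000, rng)
print(avg)  # close to the best arm's mean, 0.8
```

Because side-observations shrink every arm's confidence interval regardless of which arm is played, exploration is cheaper than in the standard bandit, which is exactly why graph-dependent regret bounds improve on the classical $O(n \log T)$.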
no code implementations • 8 Nov 2017 • Fang Liu, Joohyun Lee, Ness Shroff
The multi-armed bandit problem has been extensively studied under the stationary assumption.