Search Results for author: Dheeraj Nagaraj

Found 20 papers, 2 papers with code

A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health

no code implementations22 Feb 2024 Nikhil Behari, Edwin Zhang, Yunfan Zhao, Aparna Taneja, Dheeraj Nagaraj, Milind Tambe

Efforts to reduce the maternal mortality rate, a key UN Sustainable Development target (SDG Target 3.1), rely largely on preventative care programs to spread critical health information to high-risk populations.

Language Modelling

Towards a Pretrained Model for Restless Bandits via Multi-arm Generalization

no code implementations23 Oct 2023 Yunfan Zhao, Nikhil Behari, Edward Hughes, Edwin Zhang, Dheeraj Nagaraj, Karl Tuyls, Aparna Taneja, Milind Tambe

Restless multi-arm bandits (RMABs), a class of resource allocation problems with broad application in areas such as healthcare, online advertising, and anti-poaching, have recently been studied from a multi-agent reinforcement learning perspective.

Multi-agent Reinforcement Learning Multi-Armed Bandits +1

Near Optimal Heteroscedastic Regression with Symbiotic Learning

no code implementations25 Jun 2023 Dheeraj Baby, Aniket Das, Dheeraj Nagaraj, Praneeth Netrapalli

Our work shows that we can estimate $\mathbf{w}^{*}$ in squared norm up to an error of $\tilde{O}\left(\|\mathbf{f}^{*}\|^2 \cdot \left(\frac{1}{n} + \left(\frac{d}{n}\right)^2\right)\right)$ and prove a matching lower bound (up to log factors).

Econometrics regression +2
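
For readers skimming the listing, here is a hedged restatement of that rate in display form. The multiplicative-noise model in the comments, and the estimator symbol $\hat{\mathbf{w}}$, are assumptions inferred from the abstract's notation, not quoted from the paper.

```latex
% Assumed setting (inferred from the notation, not quoted from the paper):
% heteroscedastic linear responses y_i = <w*, x_i> + <f*, x_i> * eps_i with
% zero-mean, unit-variance noise eps_i. The stated guarantee for an estimator
% \hat{w} of w* is
\[
  \bigl\lVert \hat{\mathbf{w}} - \mathbf{w}^{*} \bigr\rVert^{2}
  \;=\; \tilde{O}\!\left( \lVert \mathbf{f}^{*} \rVert^{2}
        \cdot \left( \frac{1}{n} + \left( \frac{d}{n} \right)^{2} \right) \right),
\]
% with a matching lower bound up to logarithmic factors.
```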

Stochastic Re-weighted Gradient Descent via Distributionally Robust Optimization

no code implementations15 Jun 2023 Ramnath Kumar, Kushal Majmundar, Dheeraj Nagaraj, Arun Sai Suggala

We present Re-weighted Gradient Descent (RGD), a novel optimization technique that improves the performance of deep neural networks through dynamic sample importance weighting.

Domain Adaptation Representation Learning +1
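
A minimal sketch of the re-weighting step described above, assuming a KL-regularized DRO objective so that sample weights grow exponentially with per-sample loss. The temperature `tau`, the softmax normalization, and the function names are illustrative assumptions, not the paper's exact update.

```python
import torch

def rgd_step(model, optimizer, per_sample_loss_fn, x, y, tau=1.0):
    """One re-weighted gradient step that up-weights high-loss samples.

    Illustrative only: the exponential-tilting weights below follow from a
    KL-regularized DRO objective; `tau` and the normalization are assumed,
    not necessarily the exact weighting used by RGD.
    """
    optimizer.zero_grad()
    losses = per_sample_loss_fn(model(x), y)                # shape: (batch,)
    weights = torch.softmax(losses.detach() / tau, dim=0)   # tilt toward hard samples
    (weights * losses).sum().backward()
    optimizer.step()
```

Used in place of a plain minibatch step, e.g. with `torch.nn.CrossEntropyLoss(reduction='none')` supplying the per-sample losses.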

Indexability is Not Enough for Whittle: Improved, Near-Optimal Algorithms for Restless Bandits

1 code implementation31 Oct 2022 Abheek Ghosh, Dheeraj Nagaraj, Manish Jain, Milind Tambe

Whittle index policies, which are based on Lagrangian relaxations, are widely used in these settings due to their simplicity and near-optimality under certain conditions.

Multi-Armed Bandits
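
To make the baseline concrete, below is a sketch of the standard Whittle index policy the excerpt refers to (not the paper's improved algorithms); the array layout and budget semantics are assumptions.

```python
import numpy as np

def whittle_index_policy(indices, states, budget):
    """Act on the `budget` arms with the highest current Whittle indices.

    `indices[a][s]` holds a precomputed Whittle index for arm `a` in state
    `s`; computing those indices (and the paper's improved algorithms) is
    outside this sketch.
    """
    current = np.array([indices[a][s] for a, s in enumerate(states)])
    return np.argsort(current)[-budget:]   # indices of arms to pull this round
```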

Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation

no code implementations12 Oct 2022 Gandharv Patil, Prashanth L. A., Dheeraj Nagaraj, Doina Precup

We study the finite-time behaviour of the popular temporal difference (TD) learning algorithm when combined with tail-averaging.
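
As a reference point, here is a sketch of TD(0) with linear function approximation plus tail averaging and an optional L2 regulariser. The interfaces (an `env_stream` of transitions, a feature map `phi`) and all constants are illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np

def td0_tail_averaged(env_stream, phi, dim, alpha=0.05, gamma=0.99,
                      reg=0.0, n_steps=10_000, tail_frac=0.5):
    """TD(0) with linear function approximation, tail averaging, and an
    optional L2 regulariser; interfaces and constants are illustrative."""
    theta = np.zeros(dim)
    iterates = []
    for _, (s, r, s_next) in zip(range(n_steps), env_stream):
        td_error = r + gamma * phi(s_next) @ theta - phi(s) @ theta
        theta = theta + alpha * (td_error * phi(s) - reg * theta)
        iterates.append(theta.copy())
    tail = iterates[int((1 - tail_frac) * len(iterates)):]   # keep only the tail
    return np.mean(tail, axis=0)   # tail-averaged iterate
```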

Utilising the CLT Structure in Stochastic Gradient based Sampling : Improved Analysis and Faster Algorithms

no code implementations8 Jun 2022 Aniket Das, Dheeraj Nagaraj, Anant Raj

We consider stochastic approximations of sampling algorithms, such as Stochastic Gradient Langevin Dynamics (SGLD) and the Random Batch Method (RBM) for Interacting Particle Dynamics (IPD).
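
For orientation, a bare-bones SGLD iteration of the kind the excerpt refers to; `grad_log_target` is assumed to return an unbiased stochastic (minibatch) gradient of the log target density, and the step size and iteration count are placeholders. The paper's CLT-based analysis and faster variants are not reproduced here.

```python
import numpy as np

def sgld(grad_log_target, theta0, step=1e-3, n_iter=10_000, seed=0):
    """Stochastic Gradient Langevin Dynamics: noisy gradient ascent on the
    log density whose stationary distribution approximates the target."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    samples = []
    for _ in range(n_iter):
        noise = rng.standard_normal(theta.shape)
        theta = theta + step * grad_log_target(theta) + np.sqrt(2.0 * step) * noise
        samples.append(theta.copy())
    return np.array(samples)
```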

Introspective Experience Replay: Look Back When Surprised

1 code implementation7 Jun 2022 Ramnath Kumar, Dheeraj Nagaraj

In reinforcement learning (RL), experience replay-based sampling techniques play a crucial role in promoting convergence by eliminating spurious correlations.

Q-Learning reinforcement-learning +1
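
One plausible reading of "look back when surprised", sketched below: select the transitions with the largest TD error and also replay the transitions that immediately precede them. This is an illustrative interpretation; the paper's exact selection rule, and the constants `k` and `lookback`, are not taken from it.

```python
import numpy as np

def introspective_replay_indices(td_errors, k=32, lookback=16):
    """Indices to replay: the k most surprising transitions plus the
    transitions just before each of them (illustrative interpretation)."""
    td_errors = np.asarray(td_errors)
    surprising = np.argsort(np.abs(td_errors))[-k:]
    chosen = set()
    for i in surprising:
        chosen.update(range(max(0, i - lookback), i + 1))
    return sorted(chosen)
```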

Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs

no code implementations ICLR 2022 Naman Agarwal, Syomantak Chaudhuri, Prateek Jain, Dheeraj Nagaraj, Praneeth Netrapalli

The starting point of our work is the observation that in practice, Q-learning is used with two important modifications: (i) training with two networks simultaneously, an online network and a target network (online target learning, or OTL), and (ii) experience replay (ER) (Mnih et al., 2015).

Q-Learning Reinforcement Learning (RL)
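
A generic sketch of the two modifications named in the excerpt, a target network plus (uniform) experience replay, in DQN style. The paper itself analyses online target learning combined with reverse experience replay for linear MDPs, so the buffer layout, sampling rule, and constants below are assumptions.

```python
import random
import torch
import torch.nn.functional as F

def q_update(online, target, buffer, optimizer, batch_size=64, gamma=0.99):
    """One Q-learning step using a frozen target network and a replay buffer
    of (state, action, reward, next_state, done) tuples (states as tensors)."""
    batch = random.sample(buffer, batch_size)
    s = torch.stack([b[0] for b in batch])
    a = torch.tensor([b[1] for b in batch])
    r = torch.tensor([b[2] for b in batch], dtype=torch.float32)
    s_next = torch.stack([b[3] for b in batch])
    done = torch.tensor([b[4] for b in batch], dtype=torch.float32)

    with torch.no_grad():                    # bootstrap from the target network
        y = r + gamma * (1.0 - done) * target(s_next).max(dim=1).values
    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)

    loss = F.mse_loss(q, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The target network is refreshed periodically, e.g. via `target.load_state_dict(online.state_dict())` every few thousand steps.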

The staircase property: How hierarchical structure can guide deep learning

no code implementations NeurIPS 2021 Emmanuel Abbe, Enric Boix-Adsera, Matthew Brennan, Guy Bresler, Dheeraj Nagaraj

This paper identifies a structural property of data distributions that enables deep neural networks to learn hierarchically.

Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems

no code implementations NeurIPS 2021 Prateek Jain, Suhas S Kowshik, Dheeraj Nagaraj, Praneeth Netrapalli

In this work, we improve existing results for learning nonlinear systems in a number of ways: a) we provide the first offline algorithm that can learn non-linear dynamical systems without the mixing assumption, b) we significantly improve upon the sample complexity of existing results for mixing systems, c) in the much harder one-pass, streaming setting we study an SGD with Reverse Experience Replay ($\mathsf{SGD-RER}$) method, and demonstrate that for mixing systems, it achieves the same sample complexity as our offline algorithm, d) we justify the expansivity assumption by showing that for the popular ReLU link function -- a non-expansive but easy-to-learn link function with i.i.d.

Streaming Linear System Identification with Reverse Experience Replay

no code implementations NeurIPS 2021 Prateek Jain, Suhas S Kowshik, Dheeraj Nagaraj, Praneeth Netrapalli

Thus, we provide the first -- to the best of our knowledge -- optimal SGD-style algorithm for the classical problem of linear system identification with a first order oracle.

Reinforcement Learning (RL) Time Series Analysis
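
A minimal sketch of the reverse-replay idea for the linear model $x_{t+1} = A^{*} x_t + \eta_t$: buffer a block of consecutive observations, then run the SGD updates over the block in reverse temporal order so the iterate is less coupled to the Markovian data. The buffer size, step size, and the omission of the paper's gap/decoupling refinements are simplifications.

```python
import numpy as np

def sgd_rer(stream, d, buffer_size=100, step=1e-2, n_buffers=1000):
    """SGD with Reverse Experience Replay for streaming linear system
    identification; `stream` yields consecutive (x_t, x_{t+1}) pairs."""
    A_hat = np.zeros((d, d))
    for _ in range(n_buffers):
        buf = [next(stream) for _ in range(buffer_size)]   # consecutive samples
        for x_t, x_next in reversed(buf):                  # replay in reverse order
            resid = A_hat @ x_t - x_next
            A_hat -= step * np.outer(resid, x_t)   # grad of 0.5*||A x_t - x_{t+1}||^2
    return A_hat
```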

A law of robustness for two-layers neural networks

no code implementations30 Sep 2020 Sébastien Bubeck, Yuanzhi Li, Dheeraj Nagaraj

We make a precise conjecture that, for any Lipschitz activation function and for most datasets, any two-layers neural network with $k$ neurons that perfectly fits the data must have its Lipschitz constant larger (up to a constant) than $\sqrt{n/k}$ where $n$ is the number of datapoints.

Vocal Bursts Valence Prediction

Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms

no code implementations NeurIPS 2020 Guy Bresler, Prateek Jain, Dheeraj Nagaraj, Praneeth Netrapalli, Xian Wu

Our improved rate serves as one of the first results where an algorithm outperforms SGD-DD on an interesting Markov chain and also provides one of the first theoretical analyses to support the use of experience replay in practice.

regression

Sharp Representation Theorems for ReLU Networks with Precise Dependence on Depth

no code implementations NeurIPS 2020 Guy Bresler, Dheeraj Nagaraj

For each $D$, $\mathcal{G}_{D} \subseteq \mathcal{G}_{D+1}$ and as $D$ grows the class of functions $\mathcal{G}_{D}$ contains progressively less smooth functions.

A Corrective View of Neural Networks: Representation, Memorization and Learning

no code implementations1 Feb 2020 Guy Bresler, Dheeraj Nagaraj

This technique yields several new representation and learning results for neural networks.

Memorization

Making the Last Iterate of SGD Information Theoretically Optimal

no code implementations29 Apr 2019 Prateek Jain, Dheeraj Nagaraj, Praneeth Netrapalli

While classical theoretical analysis of SGD for convex problems studies (suffix) averages of iterates and obtains information-theoretically optimal bounds on suboptimality, the last point of SGD is, by far, the most preferred choice in practice.
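
For reference, the suffix average mentioned in the excerpt is, in a common notation (my restatement, not the paper's):

```latex
\[
  \bar{x}^{\,\alpha}_{T} \;=\; \frac{1}{\alpha T} \sum_{t = (1-\alpha)T + 1}^{T} x_t,
  \qquad \alpha \in (0, 1],
\]
% i.e. the average of the last \alpha-fraction of SGD iterates, as opposed
% to returning the final iterate x_T alone.
```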

SGD without Replacement: Sharper Rates for General Smooth Convex Functions

no code implementations4 Mar 2019 Prateek Jain, Dheeraj Nagaraj, Praneeth Netrapalli

For small $K$, we show that SGD without replacement can achieve the same convergence rate as SGD for general smooth strongly convex functions.
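
A minimal sketch of the scheme, assuming $K$ in the excerpt counts passes (epochs) over the data: each epoch visits every sample exactly once in a fresh random order. The interface and the constant step size are illustrative.

```python
import numpy as np

def sgd_without_replacement(grad_i, x0, n, n_epochs, step, seed=0):
    """SGD without replacement (random reshuffling): each epoch is one pass
    over a fresh random permutation of the n component functions."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_epochs):
        for i in rng.permutation(n):     # visit each index exactly once per epoch
            x = x - step * grad_i(x, i)
    return x
```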

Optimal Single Sample Tests for Structured versus Unstructured Network Data

no code implementations17 Feb 2018 Guy Bresler, Dheeraj Nagaraj

We develop a new approach that applies to both the Ising and Exponential Random Graph settings based on a general and natural statistical test.
