no code implementations • 22 Feb 2024 • Nikhil Behari, Edwin Zhang, Yunfan Zhao, Aparna Taneja, Dheeraj Nagaraj, Milind Tambe
Efforts to reduce maternal mortality rate, a key UN Sustainable Development target (SDG Target 3.1), rely largely on preventative care programs to spread critical health information to high-risk populations.
no code implementations • 23 Oct 2023 • Yunfan Zhao, Nikhil Behari, Edward Hughes, Edwin Zhang, Dheeraj Nagaraj, Karl Tuyls, Aparna Taneja, Milind Tambe
Restless multi-armed bandits (RMABs), a class of resource allocation problems with broad application in areas such as healthcare, online advertising, and anti-poaching, have recently been studied from a multi-agent reinforcement learning perspective.
no code implementations • 25 Jun 2023 • Dheeraj Baby, Aniket Das, Dheeraj Nagaraj, Praneeth Netrapalli
Our work shows that we can estimate $\mathbf{w}^{*}$ in squared norm up to an error of $\tilde{O}\left(\|\mathbf{f}^{*}\|^2 \cdot \left(\frac{1}{n} + \left(\frac{d}{n}\right)^2\right)\right)$ and prove a matching lower bound (up to log factors).
no code implementations • 15 Jun 2023 • Ramnath Kumar, Kushal Majmundar, Dheeraj Nagaraj, Arun Sai Suggala
We present Re-weighted Gradient Descent (RGD), a novel optimization technique that improves the performance of deep neural networks through dynamic sample importance weighting.
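A minimal sketch of dynamic sample importance weighting on a least-squares problem. The softmax-of-negative-loss weighting below is an illustrative choice, not necessarily the exact RGD scheme from the paper:

```python
import numpy as np

def reweighted_gradient_step(w, X, y, lr=0.05):
    """One gradient-descent step on squared loss in which each sample's
    gradient is scaled by a weight derived from its current loss.
    The exp(-loss) weighting is illustrative, not the paper's exact rule."""
    residuals = X @ w - y                         # per-sample errors
    losses = 0.5 * residuals ** 2                 # per-sample squared loss
    weights = np.exp(-(losses - losses.min()))    # down-weight high-loss samples
    weights /= weights.sum()                      # normalise to sum to 1
    grad = X.T @ (weights * residuals)            # weighted gradient
    return w - lr * grad
```

Because the weights are recomputed every step from the current losses, the effective objective changes dynamically as training progresses.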
1 code implementation • 31 Oct 2022 • Abheek Ghosh, Dheeraj Nagaraj, Manish Jain, Milind Tambe
Whittle index policies, which are based on Lagrangian relaxations, are widely used in these settings due to their simplicity and near-optimality under certain conditions.
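The selection step of a Whittle index policy can be sketched as follows; computing the indices themselves, via the Lagrangian relaxation, is the hard part and is assumed to have been done offline here:

```python
import numpy as np

def whittle_index_policy(indices, states, k):
    """Select the k arms with the highest Whittle index in their current state.
    `indices[a, s]` is the (precomputed) Whittle index of arm a in state s."""
    current = indices[np.arange(len(states)), states]
    return set(np.argsort(current)[-k:])
```

The near-optimality referenced in the abstract holds under conditions such as indexability of the arms, which this sketch simply assumes.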
no code implementations • 12 Oct 2022 • Gandharv Patil, Prashanth L. A., Dheeraj Nagaraj, Doina Precup
We study the finite-time behaviour of the popular temporal difference (TD) learning algorithm when combined with tail-averaging.
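A minimal tabular sketch of TD(0) with tail-averaging on a toy Markov reward process. The two-state chain, constant step size, and tail fraction are illustrative choices:

```python
import numpy as np

def td0_tail_averaged(transitions, rewards, gamma=0.9, alpha=0.1,
                      n_steps=2000, tail_frac=0.5, seed=0):
    """Tabular TD(0) with tail-averaging: run TD(0) with a constant step size
    and return the average of the last `tail_frac` fraction of the iterates
    rather than the final iterate."""
    rng = np.random.default_rng(seed)
    n = len(rewards)
    V = np.zeros(n)
    iterates = []
    s = 0
    for _ in range(n_steps):
        s_next = rng.choice(n, p=transitions[s])
        V[s] += alpha * (rewards[s] + gamma * V[s_next] - V[s])  # TD(0) update
        iterates.append(V.copy())
        s = s_next
    tail = iterates[int(n_steps * (1 - tail_frac)):]
    return np.mean(tail, axis=0)
```

Averaging over the tail suppresses the stationary fluctuations induced by the constant step size, which is the mechanism the finite-time analysis quantifies.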
no code implementations • 11 Oct 2022 • Naman Agarwal, Prateek Jain, Suhas Kowshik, Dheeraj Nagaraj, Praneeth Netrapalli
In this work, we consider the problem of collaborative multi-user reinforcement learning.
no code implementations • 8 Jun 2022 • Aniket Das, Dheeraj Nagaraj, Anant Raj
We consider stochastic approximations of sampling algorithms, such as Stochastic Gradient Langevin Dynamics (SGLD) and the Random Batch Method (RBM) for Interacting Particle Dynamics (IPD).
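As a minimal illustration of the SGLD side (the RBM concerns interacting particle systems and is not sketched here), below is SGLD for the posterior mean of a 1-D Gaussian with known unit variance and a flat prior. The step size and burn-in fraction are illustrative choices:

```python
import numpy as np

def sgld_sample(data, n_iters=5000, step=0.001, batch_size=10, seed=0):
    """Stochastic Gradient Langevin Dynamics: each update combines a minibatch
    estimate of the log-likelihood gradient with injected Gaussian noise, so
    the iterates approximately sample the posterior rather than optimise it."""
    rng = np.random.default_rng(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_iters):
        batch = rng.choice(data, size=batch_size)
        # unbiased estimate of the full-data log-likelihood gradient
        grad = (n / batch_size) * np.sum(batch - theta)
        theta += 0.5 * step * grad + np.sqrt(step) * rng.normal()
        samples.append(theta)
    return np.array(samples[n_iters // 2:])  # discard burn-in
```

The minibatch gradient is exactly the stochastic approximation the abstract refers to: it is unbiased for the full gradient but adds extra variance on top of the injected Langevin noise.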
1 code implementation • 7 Jun 2022 • Ramnath Kumar, Dheeraj Nagaraj
In reinforcement learning (RL), experience replay-based sampling techniques play a crucial role in promoting convergence by eliminating spurious correlations.
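A minimal uniform replay buffer illustrating the de-correlation mechanism; uniform sampling is the baseline scheme, and the paper's contribution concerns the design of such sampling techniques:

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity experience replay buffer: transitions are stored in
    arrival order but sampled uniformly at random, which breaks the temporal
    correlations present in the raw trajectory."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)   # oldest transitions evicted first
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return self.rng.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```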
no code implementations • ICLR 2022 • Naman Agarwal, Syomantak Chaudhuri, Prateek Jain, Dheeraj Nagaraj, Praneeth Netrapalli
The starting point of our work is the observation that in practice, Q-learning is used with two important modifications: (i) training with two networks, called online network and target network simultaneously (online target learning, or OTL), and (ii) experience replay (ER) (Mnih et al., 2015).
no code implementations • NeurIPS 2021 • Emmanuel Abbe, Enric Boix-Adsera, Matthew Brennan, Guy Bresler, Dheeraj Nagaraj
This paper identifies a structural property of data distributions that enables deep neural networks to learn hierarchically.
no code implementations • NeurIPS 2021 • Prateek Jain, Suhas S Kowshik, Dheeraj Nagaraj, Praneeth Netrapalli
In this work, we improve existing results for learning nonlinear systems in a number of ways: a) we provide the first offline algorithm that can learn non-linear dynamical systems without the mixing assumption, b) we significantly improve upon the sample complexity of existing results for mixing systems, c) in the much harder one-pass, streaming setting we study an SGD with Reverse Experience Replay ($\mathsf{SGD-RER}$) method, and demonstrate that for mixing systems, it achieves the same sample complexity as our offline algorithm, d) we justify the expansivity assumption by showing that for the popular ReLU link function -- a non-expansive but easy to learn link function with i.i.d.
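The core mechanic of $\mathsf{SGD-RER}$, buffering consecutive samples and replaying them in reverse order, can be sketched on a simple correlated regression stream. Details of the full method, such as gaps between buffers, tail-averaging, and the dynamical-system setting, are omitted here:

```python
import numpy as np

def sgd_reverse_replay(X, Y, buffer_size=10, lr=0.01):
    """SGD with Reverse Experience Replay, simplified to linear regression on
    a temporally correlated data stream: fill a small buffer with consecutive
    samples, then apply SGD updates over the buffer in reverse order to
    counteract the bias that temporal correlations induce in plain SGD."""
    d = X.shape[1]
    w = np.zeros(d)
    for start in range(0, len(X) - buffer_size + 1, buffer_size):
        for i in reversed(range(start, start + buffer_size)):  # replay backwards
            grad = (X[i] @ w - Y[i]) * X[i]   # squared-loss gradient at sample i
            w -= lr * grad
    return w
```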
no code implementations • NeurIPS 2021 • Prateek Jain, Suhas S Kowshik, Dheeraj Nagaraj, Praneeth Netrapalli
Thus, we provide the first -- to the best of our knowledge -- optimal SGD-style algorithm for the classical problem of linear system identification with a first order oracle.
no code implementations • 30 Sep 2020 • Sébastien Bubeck, Yuanzhi Li, Dheeraj Nagaraj
We make a precise conjecture that, for any Lipschitz activation function and for most datasets, any two-layer neural network with $k$ neurons that perfectly fits the data must have its Lipschitz constant larger (up to a constant) than $\sqrt{n/k}$ where $n$ is the number of datapoints.
no code implementations • NeurIPS 2020 • Guy Bresler, Prateek Jain, Dheeraj Nagaraj, Praneeth Netrapalli, Xian Wu
Our improved rate serves as one of the first results where an algorithm outperforms SGD-DD on an interesting Markov chain and also provides one of the first theoretical analyses to support the use of experience replay in practice.
no code implementations • NeurIPS 2020 • Guy Bresler, Dheeraj Nagaraj
For each $D$, $\mathcal{G}_{D} \subseteq \mathcal{G}_{D+1}$ and as $D$ grows the class of functions $\mathcal{G}_{D}$ contains progressively less smooth functions.
no code implementations • 1 Feb 2020 • Guy Bresler, Dheeraj Nagaraj
This technique yields several new representation and learning results for neural networks.
no code implementations • 29 Apr 2019 • Prateek Jain, Dheeraj Nagaraj, Praneeth Netrapalli
While classical theoretical analysis of SGD for convex problems studies (suffix) \emph{averages} of iterates and obtains information theoretically optimal bounds on suboptimality, the \emph{last point} of SGD is, by far, the most preferred choice in practice.
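The two choices being contrasted, the suffix average and the last iterate, can be sketched side by side on a noisy objective (the quadratic test problem and step size below are illustrative):

```python
import numpy as np

def sgd_last_vs_suffix(grad_fn, w0, steps, lr, suffix_frac=0.5, seed=0):
    """Run SGD with stochastic gradients from `grad_fn(w, rng)` and return
    both the last iterate and the suffix average (the mean of the last
    `suffix_frac` fraction of iterates)."""
    rng = np.random.default_rng(seed)
    w = np.array(w0, dtype=float)
    iterates = []
    for _ in range(steps):
        w = w - lr * grad_fn(w, rng)
        iterates.append(w.copy())
    suffix = np.mean(iterates[int(steps * (1 - suffix_frac)):], axis=0)
    return iterates[-1], suffix
```

On a noisy strongly convex problem the last iterate keeps fluctuating at a scale set by the step size, while the suffix average cancels much of that noise; the tension the abstract highlights is that theory favours the average while practice favours the last point.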
no code implementations • 4 Mar 2019 • Prateek Jain, Dheeraj Nagaraj, Praneeth Netrapalli
For \emph{small} $K$, we show that SGD without replacement can achieve the same convergence rate as SGD for \emph{general smooth strongly-convex} functions.
no code implementations • 17 Feb 2018 • Guy Bresler, Dheeraj Nagaraj
We develop a new approach that applies to both the Ising and Exponential Random Graph settings based on a general and natural statistical test.