no code implementations • 9 Jun 2023 • Eric Luxenberg, Dhruv Malik, Yuanzhi Li, Aarti Singh, Stephen Boyd
We consider robust empirical risk minimization (ERM), where model parameters are chosen to minimize the worst-case empirical loss when each data point varies over a given convex uncertainty set.
no code implementations • 4 May 2023 • Dhruv Malik, Conor Igoe, Yuanzhi Li, Aarti Singh
Motivated by this, a significant line of work has formalized settings where an action's loss is a function of the number of times that action was recently played in the prior $m$ timesteps, where $m$ corresponds to a bound on human memory capacity.
no code implementations • 24 Apr 2022 • Dhruv Malik, Yuanzhi Li, Aarti Singh
Policy regret is a well-established notion for measuring the performance of an online learning algorithm against an adaptive adversary.
no code implementations • 15 Jun 2021 • Dhruv Malik, Aldo Pacchiano, Vishwak Srinivasan, Yuanzhi Li
Reinforcement learning (RL) is empirically successful in complex nonlinear Markov decision processes (MDPs) with continuous state spaces.
no code implementations • NeurIPS 2021 • Dhruv Malik, Yuanzhi Li, Pradeep Ravikumar
Agents trained by reinforcement learning (RL) often fail to generalize beyond the environment they were trained in, even when presented with new scenarios that seem similar to the training environment.
no code implementations • 20 Dec 2018 • Dhruv Malik, Ashwin Pananjady, Kush Bhatia, Koulik Khamaru, Peter L. Bartlett, Martin J. Wainwright
We focus on characterizing the convergence rate of these methods when applied to linear-quadratic systems, and study various settings of driving noise and reward feedback.
no code implementations • ICML 2018 • Dhruv Malik, Malayandi Palaniappan, Jaime F. Fisac, Dylan Hadfield-Menell, Stuart Russell, Anca D. Dragan
We apply this update to a variety of POMDP solvers and find that it enables us to scale CIRL to non-trivial problems, with larger reward parameter spaces, and larger action spaces for both robot and human.
no code implementations • 20 Jul 2017 • Jaime F. Fisac, Monica A. Gates, Jessica B. Hamrick, Chang Liu, Dylan Hadfield-Menell, Malayandi Palaniappan, Dhruv Malik, S. Shankar Sastry, Thomas L. Griffiths, Anca D. Dragan
In robotics, value alignment is key to the design of collaborative robots that can integrate into human workflows, successfully inferring and adapting to their users' objectives as they go.