no code implementations • 13 Feb 2024 • Berk Bozkurt, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
How well does an optimal policy $\hat{\pi}^{\star}$ of the approximate model perform when used in the original model $\mathcal{M}$?
no code implementations • 16 Oct 2023 • Dengwang Tang, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo
We propose a Posterior Sampling-based reinforcement learning algorithm for POMDPs (PS4POMDPs), which is much simpler and easier to implement than state-of-the-art optimism-based online learning algorithms for POMDPs.
no code implementations • 24 Aug 2023 • Rishabh Agrawal, Nathan Dahlin, Rahul Jain, Ashutosh Nayyar
Classical methods such as behavioral cloning and inverse reinforcement learning are highly sensitive to estimation errors, a problem that is particularly acute in continuous state space problems.
no code implementations • 24 May 2023 • Krishna C. Kalagarla, Dhruva Kartik, Dongming Shen, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo
In this paper, we first introduce an optimal control theory for partially observable Markov decision processes (POMDPs) with finite linear temporal logic constraints.
1 code implementation • 10 Apr 2023 • Dengwang Tang, Ashutosh Nayyar, Rahul Jain
The Common Information (CI) approach provides a systematic way to transform a multi-agent stochastic control problem to a single-agent partially observed Markov decision problem (POMDP) called the coordinator's POMDP.
no code implementations • 8 Sep 2022 • Dhruva Kartik, Sagar Sudhakara, Rahul Jain, Ashutosh Nayyar
We consider a multi-agent system in which a decentralized team of agents controls a stochastic system in the presence of an adversary.
no code implementations • 17 Mar 2022 • Krishna C. Kalagarla, Dhruva Kartik, Dongming Shen, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo
Autonomous agents often operate in scenarios where the state is partially observed.
no code implementations • 8 Sep 2021 • Mehdi Jafarnia-Jahromi, Rahul Jain, Ashutosh Nayyar
In this paper, we propose Posterior Sampling Reinforcement Learning for Zero-sum Stochastic Games (PSRL-ZSG), the first online learning algorithm that achieves a Bayesian regret bound of $O(HS\sqrt{AT})$ in infinite-horizon zero-sum stochastic games with the average-reward criterion.
no code implementations • 19 Aug 2021 • Mukul Gagrani, Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
The regret bound of the algorithm was derived under a technical assumption on the induced norm of the closed loop system.
no code implementations • 18 Aug 2021 • Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
We consider the problem of controlling an unknown linear quadratic Gaussian (LQG) system consisting of multiple subsystems connected over a network.
no code implementations • 25 Feb 2021 • Mehdi Jafarnia-Jahromi, Rahul Jain, Ashutosh Nayyar
Learning optimal controllers for POMDPs when the model is unknown is harder.
no code implementations • 11 Feb 2021 • Dhruva Kartik, Ashutosh Nayyar, Urbashi Mitra
For this general model, we provide bounds on the upper (min-max) and lower (max-min) values of the game.
Multiagent Systems • Systems and Control
no code implementations • 9 Nov 2020 • Mukul Gagrani, Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
We consider optimal control of an unknown multi-agent linear quadratic (LQ) system where the dynamics and the cost are coupled across the agents through the mean-field (i.e., empirical mean) of the states and controls.
no code implementations • 27 Jan 2020 • Seyed Mohammad Asghari, Yi Ouyang, Ashutosh Nayyar
This allows the agents to achieve a regret within $O(\sqrt{T})$ of the regret of the auxiliary single-agent problem.
no code implementations • 4 Dec 2018 • Dhruva Kartik, Ashutosh Nayyar, Urbashi Mitra
In the exploration phase, experiments are selected so that a moderate level of confidence in the true hypothesis is achieved.
no code implementations • NeurIPS 2017 • Yi Ouyang, Mukul Gagrani, Ashutosh Nayyar, Rahul Jain
This regret bound matches the best available bound for weakly communicating MDPs.
no code implementations • 8 Sep 2012 • Ashutosh Nayyar, Aditya Mahajan, Demosthenis Teneketzis
A general model of decentralized stochastic control called partial history sharing information structure is presented.
Systems and Control • Optimization and Control