Search Results for author: Ashutosh Nayyar

Found 17 papers, 1 paper with code

Model approximation in MDPs with unbounded per-step cost

no code implementations • 13 Feb 2024 • Berk Bozkurt, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang

How well does an optimal policy $\hat{\pi}^{\star}$ of the approximate model perform when used in the original model $\mathcal{M}$?

Regret Analysis of the Posterior Sampling-based Learning Algorithm for Episodic POMDPs

no code implementations • 16 Oct 2023 • Dengwang Tang, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

We propose a Posterior Sampling-based reinforcement learning algorithm for POMDPs (PS4POMDPs), which is much simpler and more implementable compared to state-of-the-art optimism-based online learning algorithms for POMDPs.

Conditional Kernel Imitation Learning for Continuous State Environments

no code implementations • 24 Aug 2023 • Rishabh Agrawal, Nathan Dahlin, Rahul Jain, Ashutosh Nayyar

Classical methods such as behavioral cloning and inverse reinforcement learning are highly sensitive to estimation errors, a problem that is particularly acute in continuous state space problems.

Density Estimation • Imitation Learning • +2

Optimal Control of Logically Constrained Partially Observable and Multi-Agent Markov Decision Processes

no code implementations • 24 May 2023 • Krishna C. Kalagarla, Dhruva Kartik, Dongming Shen, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

In this paper, we first introduce an optimal control theory for partially observable Markov decision processes (POMDPs) with finite linear temporal logic constraints.

A Novel Point-based Algorithm for Multi-agent Control Using the Common Information Approach

1 code implementation • 10 Apr 2023 • Dengwang Tang, Ashutosh Nayyar, Rahul Jain

The Common Information (CI) approach provides a systematic way to transform a multi-agent stochastic control problem to a single-agent partially observed Markov decision problem (POMDP) called the coordinator's POMDP.

Optimal Communication and Control Strategies for a Multi-Agent System in the Presence of an Adversary

no code implementations • 8 Sep 2022 • Dhruva Kartik, Sagar Sudhakara, Rahul Jain, Ashutosh Nayyar

We consider a multi-agent system in which a decentralized team of agents controls a stochastic system in the presence of an adversary.

Optimal Control of Partially Observable Markov Decision Processes with Finite Linear Temporal Logic Constraints

no code implementations • 17 Mar 2022 • Krishna C. Kalagarla, Dhruva Kartik, Dongming Shen, Rahul Jain, Ashutosh Nayyar, Pierluigi Nuzzo

Autonomous agents often operate in scenarios where the state is partially observed.

A Bayesian Learning Algorithm for Unknown Zero-sum Stochastic Games with an Arbitrary Opponent

no code implementations • 8 Sep 2021 • Mehdi Jafarnia-Jahromi, Rahul Jain, Ashutosh Nayyar

In this paper, we propose Posterior Sampling Reinforcement Learning for Zero-sum Stochastic Games (PSRL-ZSG), the first online learning algorithm that achieves a Bayesian regret bound of $O(HS\sqrt{AT})$ in infinite-horizon zero-sum stochastic games with the average-reward criterion.

Reinforcement Learning (RL)

A relaxed technical assumption for posterior sampling-based reinforcement learning for control of unknown linear systems

no code implementations • 19 Aug 2021 • Mukul Gagrani, Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang

The regret bound of the algorithm was derived under a technical assumption on the induced norm of the closed loop system.

Thompson Sampling

Scalable regret for learning to control network-coupled subsystems with unknown dynamics

no code implementations • 18 Aug 2021 • Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang

We consider the problem of controlling an unknown linear quadratic Gaussian (LQG) system consisting of multiple subsystems connected over a network.

Thompson Sampling

Online Learning for Unknown Partially Observable MDPs

no code implementations • 25 Feb 2021 • Mehdi Jafarnia-Jahromi, Rahul Jain, Ashutosh Nayyar

Learning optimal controllers for POMDPs when the model is unknown is harder.

Common Information Belief based Dynamic Programs for Stochastic Zero-sum Games with Competing Teams

no code implementations • 11 Feb 2021 • Dhruva Kartik, Ashutosh Nayyar, Urbashi Mitra

For this general model, we provide bounds on the upper (min-max) and lower (max-min) values of the game.

Multiagent Systems • Systems and Control

Thompson sampling for linear quadratic mean-field teams

no code implementations • 9 Nov 2020 • Mukul Gagrani, Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang

We consider optimal control of an unknown multi-agent linear quadratic (LQ) system where the dynamics and the cost are coupled across the agents through the mean-field (i.e., empirical mean) of the states and controls.

Thompson Sampling

Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems

no code implementations • 27 Jan 2020 • Seyed Mohammad Asghari, Yi Ouyang, Ashutosh Nayyar

This allows the agents to achieve a regret within $O(\sqrt{T})$ of the regret of the auxiliary single-agent problem.

Multi-agent Reinforcement Learning

Sequential Experiment Design for Hypothesis Verification

no code implementations • 4 Dec 2018 • Dhruva Kartik, Ashutosh Nayyar, Urbashi Mitra

In the exploration phase, selection of experiments is such that a moderate level of confidence on the true hypothesis is achieved.

Two-sample testing

Learning Unknown Markov Decision Processes: A Thompson Sampling Approach

no code implementations • NeurIPS 2017 • Yi Ouyang, Mukul Gagrani, Ashutosh Nayyar, Rahul Jain

This regret bound matches the best available bound for weakly communicating MDPs.

Thompson Sampling

Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach

no code implementations • 8 Sep 2012 • Ashutosh Nayyar, Aditya Mahajan, Demosthenis Teneketzis

A general model of decentralized stochastic control called partial history sharing information structure is presented.

Systems and Control • Optimization and Control
