Search Results for author: Udari Madhushani

Found 14 papers, 1 paper with code

O3D: Offline Data-driven Discovery and Distillation for Sequential Decision-Making with Large Language Models

no code implementations • 22 Oct 2023 • Yuchen Xiao, Yanchao Sun, Mengda Xu, Udari Madhushani, Jared Vann, Deepeka Garg, Sumitra Ganesh

Recent advancements in large language models (LLMs) have exhibited promising performance in solving sequential decision-making problems.

Decision Making, In-Context Learning

Heterogeneous Social Value Orientation Leads to Meaningful Diversity in Sequential Social Dilemmas

no code implementations • 1 May 2023 • Udari Madhushani, Kevin R. McKee, John P. Agapiou, Joel Z. Leibo, Richard Everett, Thomas Anthony, Edward Hughes, Karl Tuyls, Edgar A. Duéñez-Guzmán

In social psychology, Social Value Orientation (SVO) describes an individual's propensity to allocate resources between themself and others.

Zero-shot Generalization

Melting Pot 2.0

2 code implementations • 24 Nov 2022 • John P. Agapiou, Alexander Sasha Vezhnevets, Edgar A. Duéñez-Guzmán, Jayd Matyas, Yiran Mao, Peter Sunehag, Raphael Köster, Udari Madhushani, Kavya Kopparapu, Ramona Comanescu, DJ Strouse, Michael B. Johanson, Sukhdeep Singh, Julia Haas, Igor Mordatch, Dean Mobbs, Joel Z. Leibo

Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios.

Artificial Life, Navigate

A Regret Minimization Approach to Multi-Agent Control

no code implementations • 28 Jan 2022 • Udaya Ghai, Udari Madhushani, Naomi Leonard, Elad Hazan

We study the problem of multi-agent control of a dynamical system with known dynamics and adversarial disturbances.

One More Step Towards Reality: Cooperative Bandits with Imperfect Communication

no code implementations • NeurIPS 2021 • Udari Madhushani, Abhimanyu Dubey, Naomi Ehrich Leonard, Alex Pentland

Most research on cooperative bandits focuses exclusively on the setting with perfect communication, whereas in real-world distributed settings communication often takes place over stochastic networks with arbitrary corruptions and delays.

Decision Making

Provably Efficient Multi-Agent Reinforcement Learning with Fully Decentralized Communication

no code implementations • 14 Oct 2021 • Justin Lidard, Udari Madhushani, Naomi Ehrich Leonard

Distributed exploration reduces sampling complexity in multi-agent RL (MARL).

Multi-agent Reinforcement Learning, Q-Learning (+2 more)

When to Call Your Neighbor? Strategic Communication in Cooperative Stochastic Bandits

no code implementations • 8 Oct 2021 • Udari Madhushani, Naomi Leonard

We propose ComEx, a novel cost-effective communication protocol in which the group achieves the same order of performance as full communication while sending only $O(\log T)$ messages.

Decision Making

Distributed Bandits: Probabilistic Communication on $d$-regular Graphs

no code implementations • 16 Nov 2020 • Udari Madhushani, Naomi Ehrich Leonard

Every edge in the graph has probabilistic weight $p$ to account for the probability $1-p$ of a communication link failure.

On Using Hamiltonian Monte Carlo Sampling for Reinforcement Learning Problems in High-dimension

no code implementations • 11 Nov 2020 • Udari Madhushani, Biswadip Dey, Naomi Ehrich Leonard, Amit Chakraborty

Value-function-based reinforcement learning (RL) algorithms, such as $Q$-learning, learn optimal policies from datasets of actions, rewards, and state transitions.

Matrix Completion, Q-Learning (+2 more)

It Doesn’t Get Better and Here’s Why: A Fundamental Drawback in Natural Extensions of UCB to Multi-agent Bandits

no code implementations • NeurIPS Workshop ICBINB 2020 • Udari Madhushani, Naomi Leonard

We identify a fundamental drawback of natural extensions of Upper Confidence Bound (UCB) algorithms to the multi-agent bandit problem in which multiple agents facing the same explore-exploit problem can share information.

Heterogeneous Explore-Exploit Strategies on Multi-Star Networks

no code implementations • 2 Sep 2020 • Udari Madhushani, Naomi Leonard

We study a class of distributed stochastic bandit problems in which agents communicate over a multi-star network and make sequential choices among options in the same uncertain environment.

Decision Making

Distributed Learning: Sequential Decision Making in Resource-Constrained Environments

no code implementations • 13 Apr 2020 • Udari Madhushani, Naomi Ehrich Leonard

We study cost-effective communication strategies that can be used to improve the performance of distributed learning systems in resource-constrained environments.

Decision Making

A Dynamic Observation Strategy for Multi-agent Multi-armed Bandit Problem

no code implementations • 8 Apr 2020 • Udari Madhushani, Naomi Ehrich Leonard

We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors under a linear observation cost.

Decision Making

Heterogeneous Stochastic Interactions for Multiple Agents in a Multi-armed Bandit Problem

no code implementations • 21 May 2019 • Udari Madhushani, Naomi Ehrich Leonard

We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors.

Decision Making
