no code implementations • 22 Oct 2023 • Yuchen Xiao, Yanchao Sun, Mengda Xu, Udari Madhushani, Jared Vann, Deepeka Garg, Sumitra Ganesh
Recent advancements in large language models (LLMs) have exhibited promising performance in solving sequential decision-making problems.
no code implementations • 1 May 2023 • Udari Madhushani, Kevin R. McKee, John P. Agapiou, Joel Z. Leibo, Richard Everett, Thomas Anthony, Edward Hughes, Karl Tuyls, Edgar A. Duéñez-Guzmán
In social psychology, Social Value Orientation (SVO) describes an individual's propensity to allocate resources between themself and others.
2 code implementations • 24 Nov 2022 • John P. Agapiou, Alexander Sasha Vezhnevets, Edgar A. Duéñez-Guzmán, Jayd Matyas, Yiran Mao, Peter Sunehag, Raphael Köster, Udari Madhushani, Kavya Kopparapu, Ramona Comanescu, DJ Strouse, Michael B. Johanson, Sukhdeep Singh, Julia Haas, Igor Mordatch, Dean Mobbs, Joel Z. Leibo
Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence, and provides an evaluation protocol that measures generalization to novel social partners in a set of canonical test scenarios.
no code implementations • 28 Jan 2022 • Udaya Ghai, Udari Madhushani, Naomi Leonard, Elad Hazan
We study the problem of multi-agent control of a dynamical system with known dynamics and adversarial disturbances.
no code implementations • NeurIPS 2021 • Udari Madhushani, Abhimanyu Dubey, Naomi Ehrich Leonard, Alex Pentland
However, most research for this problem focuses exclusively on the setting with perfect communication, whereas in most real-world distributed settings, communication is often over stochastic networks, with arbitrary corruptions and delays.
no code implementations • 14 Oct 2021 • Justin Lidard, Udari Madhushani, Naomi Ehrich Leonard
Distributed exploration reduces sampling complexity in multi-agent RL (MARL).
no code implementations • 8 Oct 2021 • Udari Madhushani, Naomi Leonard
We propose ComEx, a novel cost-effective communication protocol in which the group achieves the same order of performance as full communication while communicating only $O(\log T)$ messages.
no code implementations • 16 Nov 2020 • Udari Madhushani, Naomi Ehrich Leonard
Every edge in the graph has a probabilistic weight $p$ to account for the $(1-p)$ probability of a communication link failure.
no code implementations • 11 Nov 2020 • Udari Madhushani, Biswadip Dey, Naomi Ehrich Leonard, Amit Chakraborty
Value-function-based reinforcement learning (RL) algorithms, such as $Q$-learning, learn optimal policies from datasets of actions, rewards, and state transitions.
no code implementations • NeurIPS Workshop ICBINB 2020 • Udari Madhushani, Naomi Leonard
We identify a fundamental drawback of natural extensions of Upper Confidence Bound (UCB) algorithms to the multi-agent bandit problem in which multiple agents facing the same explore-exploit problem can share information.
no code implementations • 2 Sep 2020 • Udari Madhushani, Naomi Leonard
To do so we study a class of distributed stochastic bandit problems in which agents communicate over a multi-star network and make sequential choices among options in the same uncertain environment.
no code implementations • 13 Apr 2020 • Udari Madhushani, Naomi Ehrich Leonard
We study cost-effective communication strategies that can be used to improve the performance of distributed learning systems in resource-constrained environments.
no code implementations • 8 Apr 2020 • Udari Madhushani, Naomi Ehrich Leonard
We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors under a linear observation cost.
no code implementations • 21 May 2019 • Udari Madhushani, Naomi Ehrich Leonard
We define and analyze a multi-agent multi-armed bandit problem in which decision-making agents can observe the choices and rewards of their neighbors.