Search Results for author: Hengyuan Hu

Found 22 papers, 7 papers with code

“Other-Play” for Zero-Shot Coordination

no code implementations • ICML 2020 • Hengyuan Hu, Alexander Peysakhovich, Adam Lerer, Jakob Foerster

We consider the problem of zero-shot coordination: constructing AI agents that can coordinate with novel partners they have not seen before (e.g. humans).

Multi-agent Reinforcement Learning • Reinforcement Learning (RL)

Imitation Bootstrapped Reinforcement Learning

no code implementations • 3 Nov 2023 • Hengyuan Hu, Suvir Mirchandani, Dorsa Sadigh

Despite the considerable potential of reinforcement learning (RL), robotic control tasks predominantly rely on imitation learning (IL) due to its better sample efficiency.

Continuous Control • Imitation Learning • +2

Toward Grounded Commonsense Reasoning

no code implementations • 14 Jun 2023 • Minae Kwon, Hengyuan Hu, Vivek Myers, Siddharth Karamcheti, Anca Dragan, Dorsa Sadigh

We additionally illustrate our approach with a robot on 2 carefully designed surfaces.

Language Modelling

The Update-Equivalence Framework for Decision-Time Planning

no code implementations • 25 Apr 2023 • Samuel Sokota, Gabriele Farina, David J. Wu, Hengyuan Hu, Kevin A. Wang, J. Zico Kolter, Noam Brown

Using this framework, we derive a provably sound search algorithm for fully cooperative games based on mirror descent and a search algorithm for adversarial games based on magnetic mirror descent.
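The mirror-descent update such a search algorithm builds on has a simple closed form on the probability simplex. A minimal sketch, assuming the standard negative-entropy mirror map (this is the generic exponentiated-gradient update, not the paper's full search procedure; magnetic mirror descent additionally regularizes toward a "magnet" policy):

```python
import numpy as np

def mirror_descent_step(policy, utilities, lr=1.0):
    """One mirror-descent update on the probability simplex.

    With the negative-entropy mirror map this is the classic
    exponentiated-gradient update: multiply each action's probability
    by exp(lr * utility) and renormalize.
    """
    logits = np.log(policy) + lr * utilities
    new_policy = np.exp(logits - logits.max())  # subtract max for numerical stability
    return new_policy / new_policy.sum()

# Toy example: uniform start, action 2 has the highest utility.
policy = np.ones(3) / 3
utilities = np.array([0.0, 1.0, 2.0])
for _ in range(50):
    policy = mirror_descent_step(policy, utilities)
```

Under a fixed utility vector the iterates concentrate on the best action; in a search setting the utilities would instead come from value estimates at the current decision point.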

Human-AI Coordination via Human-Regularized Search and Learning

no code implementations • 11 Oct 2022 • Hengyuan Hu, David J Wu, Adam Lerer, Jakob Foerster, Noam Brown

First, we show that our method outperforms experts when playing with a group of diverse human players in ad-hoc teams.

K-level Reasoning for Zero-Shot Coordination in Hanabi

no code implementations • NeurIPS 2021 • Brandon Cui, Hengyuan Hu, Luis Pineda, Jakob N. Foerster

The standard problem setting in cooperative multi-agent learning is self-play (SP), where the goal is to train a team of agents that works well together.

Self-Explaining Deviations for Coordination

no code implementations • 13 Jul 2022 • Hengyuan Hu, Samuel Sokota, David Wu, Anton Bakhtin, Andrei Lupu, Brandon Cui, Jakob N. Foerster

Fully cooperative, partially observable multi-agent problems are ubiquitous in the real world.

Modeling Strong and Human-Like Gameplay with KL-Regularized Search

no code implementations • 14 Dec 2021 • Athul Paul Jacob, David J. Wu, Gabriele Farina, Adam Lerer, Hengyuan Hu, Anton Bakhtin, Jacob Andreas, Noam Brown

We consider the task of building strong but human-like policies in multi-agent decision-making problems, given examples of human behavior.

Imitation Learning
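A KL-regularized objective of this flavor has a well-known closed form: maximizing the expected value minus lambda times the KL divergence to an anchor policy yields a softmax tilted toward that anchor. A sketch of that standard closed form only (the paper's actual search algorithm differs in its details; `q`, `tau`, and `lam` below are illustrative values, not from the paper):

```python
import numpy as np

def kl_regularized_policy(q_values, anchor, lam):
    """Closed-form maximizer of  E_pi[Q] - lam * KL(pi || anchor).

    The solution is  pi(a)  proportional to  anchor(a) * exp(Q(a) / lam):
    large lam stays close to the anchor (e.g. a human-imitation policy),
    small lam plays greedily with respect to Q.
    """
    logits = np.log(anchor) + q_values / lam
    p = np.exp(logits - logits.max())  # subtract max for numerical stability
    return p / p.sum()

q = np.array([1.0, 0.0, 0.0])
tau = np.array([0.1, 0.8, 0.1])                      # hypothetical imitation policy
greedy = kl_regularized_policy(q, tau, lam=0.1)       # nearly argmax of Q
humanlike = kl_regularized_policy(q, tau, lam=10.0)   # nearly tau
```

The two extremes show the trade-off the title refers to: strength (follow Q) versus human-likeness (follow the imitation-learned anchor).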

A Fine-Tuning Approach to Belief State Modeling

no code implementations • ICLR 2022 • Samuel Sokota, Hengyuan Hu, David J Wu, J Zico Kolter, Jakob Nicolaus Foerster, Noam Brown

Furthermore, because this specialization occurs after the action or policy has already been decided, BFT does not require the belief model to process it as input.

Learned Belief Search: Efficiently Improving Policies in Partially Observable Settings

no code implementations • 16 Jun 2021 • Hengyuan Hu, Adam Lerer, Noam Brown, Jakob Foerster

Search is an important tool for computing effective policies in single- and multi-agent environments, and has been crucial for achieving superhuman performance in several benchmark fully and partially observable games.

counterfactual

Off-Belief Learning

5 code implementations • 6 Mar 2021 • Hengyuan Hu, Adam Lerer, Brandon Cui, David Wu, Luis Pineda, Noam Brown, Jakob Foerster

Policies learned through self-play may adopt arbitrary conventions and implicitly rely on multi-step reasoning based on fragile assumptions about other agents' actions, and thus fail when paired with humans or independently trained agents at test time.

Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian

no code implementations • NeurIPS 2020 • Jack Parker-Holder, Luke Metz, Cinjon Resnick, Hengyuan Hu, Adam Lerer, Alistair Letcher, Alex Peysakhovich, Aldo Pacchiano, Jakob Foerster

In the era of ever-decreasing loss functions, SGD and its various offspring have become the go-to optimization tool in machine learning and are a key component of the success of deep neural networks (DNNs).

BIG-bench Machine Learning
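Eigenvectors of the Hessian, which Ridge Rider follows, can be extracted using only Hessian-vector products. A toy sketch on a fixed quadratic loss, where the eigenvectors are known in closed form (this illustrates one ingredient, power iteration via Hessian-vector products, not the paper's full branching method):

```python
import numpy as np

# Toy loss L(x) = 0.5 * x^T H x with a fixed symmetric Hessian, so the
# eigenvectors are known exactly: the coordinate axes.
H = np.array([[2.0, 0.0],
              [0.0, -1.0]])  # one ascent and one descent direction

def hvp(v):
    """Hessian-vector product; for this quadratic it is exactly H @ v.
    In practice it would come from automatic differentiation."""
    return H @ v

def top_eigenvector(hvp_fn, dim, iters=100):
    """Power iteration using only Hessian-vector products."""
    v = np.ones(dim) / np.sqrt(dim)
    for _ in range(iters):
        v = hvp_fn(v)
        v = v / np.linalg.norm(v)
    return v

v = top_eigenvector(hvp, 2)  # converges to the eigenvector with largest |eigenvalue|
```

For a real network the `hvp` would be computed with reverse-over-forward autodiff, which avoids ever materializing the full Hessian.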

"Other-Play" for Zero-Shot Coordination

2 code implementations • 6 Mar 2020 • Hengyuan Hu, Adam Lerer, Alex Peysakhovich, Jakob Foerster

We consider the problem of zero-shot coordination: constructing AI agents that can coordinate with novel partners they have not seen before (e.g. humans).

Multi-agent Reinforcement Learning

Improving Policies via Search in Cooperative Partially Observable Games

10 code implementations • 5 Dec 2019 • Adam Lerer, Hengyuan Hu, Jakob Foerster, Noam Brown

The first one, single-agent search, effectively converts the problem into a single agent setting by making all but one of the agents play according to the agreed-upon policy.

Game of Hanabi
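The single-agent reduction described above can be illustrated on a toy cooperative matrix game, assuming the partner's blueprint policy is known (the actual method, applied to Hanabi, also maintains beliefs over hidden information, which this sketch omits):

```python
import numpy as np

# Toy cooperative game: joint payoff indexed by (searcher action, partner action).
payoff = np.array([[1.0, 0.0],
                   [0.0, 3.0]])

# Agreed-upon blueprint: the partner plays action 1 with probability 0.9.
blueprint = np.array([0.1, 0.9])

def single_agent_search(payoff, blueprint):
    """With the partner fixed to the blueprint, the cooperative game
    becomes a single-agent problem: pick the action with the highest
    expected payoff under the partner's policy."""
    expected = payoff @ blueprint  # expected value of each searcher action
    return int(np.argmax(expected))

best = single_agent_search(payoff, blueprint)  # coordinates on action 1
```

The same idea scales to sequential games by rolling out each candidate action while every other agent follows the blueprint.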

Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning

4 code implementations • ICLR 2020 • Hengyuan Hu, Jakob N. Foerster

Learning to be informative when observed by others is an interesting challenge for Reinforcement Learning (RL): fundamentally, RL requires agents to explore in order to discover good policies.

Multi-agent Reinforcement Learning • reinforcement-learning • +1

Hierarchical Decision Making by Generating and Following Natural Language Instructions

1 code implementation • NeurIPS 2019 • Hengyuan Hu, Denis Yarats, Qucheng Gong, Yuandong Tian, Mike Lewis

We explore using latent natural language instructions as an expressive and compositional representation of complex actions for hierarchical decision making.

Decision Making

Learning Deep Generative Models With Discrete Latent Variables

no code implementations • ICLR 2018 • Hengyuan Hu, Ruslan Salakhutdinov

There have been numerous recent advancements on learning deep generative models with latent variables, thanks to the reparameterization trick, which makes it possible to train deep directed models effectively.

Density Estimation
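For continuous (Gaussian) latents, the reparameterization trick mentioned above rewrites sampling as a deterministic function of the parameters plus parameter-free noise; discrete latents, the paper's focus, require other estimators (for example, relaxations such as Gumbel-softmax). A minimal sketch of the continuous case:

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterized_sample(mu, log_sigma, rng):
    """Reparameterization trick: express z ~ N(mu, sigma^2) as a
    deterministic, differentiable function of (mu, sigma) plus
    parameter-free noise eps ~ N(0, 1), so gradients can flow through
    mu and sigma instead of through a sampling operation."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(log_sigma) * eps

# Many samples from N(0, 2^2) via the reparameterized form.
mu = np.zeros(100_000)
log_sigma = np.log(2.0) * np.ones(100_000)
z = reparameterized_sample(mu, log_sigma, rng)
```

Because `eps` carries all the randomness, an autodiff framework can backpropagate a loss on `z` into `mu` and `log_sigma` directly.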

Deep Restricted Boltzmann Networks

no code implementations • 15 Nov 2016 • Hengyuan Hu, Lisheng Gao, Quanbin Ma

The most famous among them are the deep belief network, which stacks multiple layer-wise pretrained RBMs to form a hybrid model, and the deep Boltzmann machine, which allows connections between hidden units to form a multi-layer structure.
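The layer-wise RBM pretraining that deep belief networks stack is typically done with contrastive divergence. A minimal CD-1 sketch for a binary RBM (an illustration of the standard building block, not the paper's deep restricted Boltzmann network):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, rng, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM.
    v: visible units, h: hidden units, W: weights, b/c: biases."""
    # Positive phase: hidden activations given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one step of block Gibbs sampling.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # Gradient approximation: data statistics minus model statistics.
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / v0.shape[0]
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c

# Tiny demo: learn to reconstruct a repeated 4-bit pattern.
data = np.tile(np.array([1.0, 0.0, 1.0, 0.0]), (32, 1))
W = 0.01 * rng.standard_normal((4, 8))
b = np.zeros(4)
c = np.zeros(8)
for _ in range(200):
    W, b, c = cd1_step(data, W, b, c, rng)
```

After training, the visible biases and weights favor reconstructing the on-bits of the pattern; stacking such pretrained RBMs is what yields a deep belief network.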
