Search Results for author: Jianhao Wang

Found 18 papers, 11 papers with code

Latent-Variable Advantage-Weighted Policy Optimization for Offline RL

1 code implementation 16 Mar 2022 Xi Chen, Ali Ghadirzadeh, Tianhe Yu, Yuan Gao, Jianhao Wang, Wenzhe Li, Bin Liang, Chelsea Finn, Chongjie Zhang

Offline reinforcement learning methods hold the promise of learning policies from pre-collected datasets without the need to query the environment for new transitions.

Continuous Control Offline RL +2

Self-Organized Polynomial-Time Coordination Graphs

1 code implementation 7 Dec 2021 Qianlan Yang, Weijun Dong, Zhizhou Ren, Jianhao Wang, Tonghan Wang, Chongjie Zhang

However, one critical challenge in this paradigm is the complexity of greedy action selection with respect to the factorized values.

Computational Efficiency Multi-agent Reinforcement Learning

Offline Reinforcement Learning with Reverse Model-based Imagination

1 code implementation NeurIPS 2021 Jianhao Wang, Wenzhe Li, Haozhe Jiang, Guangxiang Zhu, Siyuan Li, Chongjie Zhang

These reverse imaginations provide informed data augmentation for model-free policy learning and enable conservative generalization beyond the offline dataset.

Data Augmentation Offline RL +2

Active Hierarchical Exploration with Stable Subgoal Representation Learning

1 code implementation ICLR 2022 Siyuan Li, Jin Zhang, Jianhao Wang, Yang Yu, Chongjie Zhang

Although GCHRL possesses superior exploration ability by decomposing tasks via subgoals, existing GCHRL methods struggle in temporally extended tasks with sparse external rewards, since the high-level policy learning relies on external rewards.

Continuous Control Hierarchical Reinforcement Learning +1

Learning Subgoal Representations with Slow Dynamics

no code implementations ICLR 2021 Siyuan Li, Lulu Zheng, Jianhao Wang, Chongjie Zhang

In goal-conditioned Hierarchical Reinforcement Learning (HRL), a high-level policy periodically sets subgoals for a low-level policy, and the low-level policy is trained to reach those subgoals.

Continuous Control Hierarchical Reinforcement Learning +1

QPLEX: Duplex Dueling Multi-Agent Q-Learning

5 code implementations ICLR 2021 Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu, Chongjie Zhang

This paper presents a novel MARL approach, called duPLEX dueling multi-agent Q-learning (QPLEX), which takes a duplex dueling network architecture to factorize the joint value function.

Decision Making Multi-agent Reinforcement Learning +3

Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization

no code implementations NeurIPS 2021 Jianhao Wang, Zhizhou Ren, Beining Han, Jianing Ye, Chongjie Zhang

Value factorization is a popular and promising approach to scaling up multi-agent reinforcement learning in cooperative settings, which balances the learning scalability and the representational capacity of value functions.

counterfactual Multi-agent Reinforcement Learning +3

Influence-Based Multi-Agent Exploration

1 code implementation ICLR 2020 Tonghan Wang, Jianhao Wang, Yi Wu, Chongjie Zhang

We present two exploration methods: exploration via information-theoretic influence (EITI) and exploration via decision-theoretic influence (EDTI), by exploiting the role of interaction in coordinated behaviors of agents.

Reinforcement Learning (RL)

Learning Nearly Decomposable Value Functions Via Communication Minimization

1 code implementation ICLR 2020 Tonghan Wang, Jianhao Wang, Chongyi Zheng, Chongjie Zhang

Recently, value function factorization learning has emerged as a promising way to address these challenges in collaborative multi-agent systems.

Starcraft

Object-Oriented Model Learning through Multi-Level Abstraction

no code implementations ICLR 2019 Guangxiang Zhu, Jianhao Wang, Zhizhou Ren, Chongjie Zhang

Object-based approaches for learning action-conditioned dynamics have demonstrated promise for generalization and interpretability.

Object Relational Reasoning +1

Object-Oriented Dynamics Learning through Multi-Level Abstraction

1 code implementation 16 Apr 2019 Guangxiang Zhu, Jianhao Wang, Zhizhou Ren, Zichuan Lin, Chongjie Zhang

We also design a spatial-temporal relational reasoning mechanism for MAOP to support instance-level dynamics learning and handle partial observability.

Object Relational Reasoning +1
