Search Results for author: Zhizhou Ren

Found 12 papers, 8 papers with code

Efficient Meta Reinforcement Learning for Preference-based Fast Adaptation

1 code implementation • 20 Nov 2022 • Zhizhou Ren, Anji Liu, Yitao Liang, Jian Peng, Jianzhu Ma

We study the problem of few-shot adaptation in the context of human-in-the-loop reinforcement learning.

Meta Reinforcement Learning • reinforcement-learning • +1

Self-Organized Polynomial-Time Coordination Graphs

1 code implementation • 7 Dec 2021 • Qianlan Yang, Weijun Dong, Zhizhou Ren, Jianhao Wang, Tonghan Wang, Chongjie Zhang

One critical challenge for coordination-graph methods is the complexity of greedy action selection with respect to the factorized values.

Computational Efficiency • Multi-agent Reinforcement Learning

Learning Long-Term Reward Redistribution via Randomized Return Decomposition

1 code implementation • ICLR 2022 • Zhizhou Ren, Ruihan Guo, Yuan Zhou, Jian Peng

This paper proposes a novel reward redistribution algorithm, randomized return decomposition (RRD), to learn a proxy reward function for episodic reinforcement learning.

Attribute • reinforcement-learning • +1

On the Estimation Bias in Double Q-Learning

1 code implementation • NeurIPS 2021 • Zhizhou Ren, Guangxiang Zhu, Hao Hu, Beining Han, Jianglun Chen, Chongjie Zhang

Double Q-learning is a classical method for reducing overestimation bias, which is caused by taking maximum estimated values in the Bellman operation.

Q-Learning • Value prediction

Off-Policy Reinforcement Learning with Delayed Rewards

no code implementations • 22 Jun 2021 • Beining Han, Zhizhou Ren, Zuofan Wu, Yuan Zhou, Jian Peng

We study deep reinforcement learning (RL) algorithms with delayed rewards.

reinforcement-learning • Reinforcement Learning (RL)

Generalizable Episodic Memory for Deep Reinforcement Learning

1 code implementation • 11 Mar 2021 • Hao Hu, Jianing Ye, Guangxiang Zhu, Zhizhou Ren, Chongjie Zhang

Episodic memory-based methods can rapidly latch onto past successful strategies via a non-parametric memory and improve the sample efficiency of traditional reinforcement learning.

Atari Games • Continuous Control • +2

Towards Understanding Linear Value Decomposition in Cooperative Multi-Agent Q-Learning

no code implementations • 28 Sep 2020 • Jianhao Wang, Zhizhou Ren, Beining Han, Jianing Ye, Chongjie Zhang

Value decomposition is a popular and promising approach to scaling up multi-agent reinforcement learning in cooperative settings.

counterfactual • Multi-agent Reinforcement Learning • +3

QPLEX: Duplex Dueling Multi-Agent Q-Learning

5 code implementations • ICLR 2021 • Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu, Chongjie Zhang

This paper presents a novel MARL approach, called duPLEX dueling multi-agent Q-learning (QPLEX), which takes a duplex dueling network architecture to factorize the joint value function.

Decision Making • Multi-agent Reinforcement Learning • +3

Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization

no code implementations • NeurIPS 2021 • Jianhao Wang, Zhizhou Ren, Beining Han, Jianing Ye, Chongjie Zhang

Value factorization is a popular and promising approach to scaling up multi-agent reinforcement learning in cooperative settings, which balances the learning scalability and the representational capacity of value functions.

counterfactual • Multi-agent Reinforcement Learning • +3

Exploration via Hindsight Goal Generation

1 code implementation • NeurIPS 2019 • Zhizhou Ren, Kefan Dong, Yuan Zhou, Qiang Liu, Jian Peng

Goal-oriented reinforcement learning has recently become a practical framework for robotic manipulation tasks, in which an agent is required to reach a certain goal defined by a function on the state space.

reinforcement-learning • Reinforcement Learning (RL)

Object-Oriented Model Learning through Multi-Level Abstraction

no code implementations • ICLR 2019 • Guangxiang Zhu, Jianhao Wang, Zhizhou Ren, Chongjie Zhang

Object-based approaches for learning action-conditioned dynamics have demonstrated promise for generalization and interpretability.

Object • Relational Reasoning • +1

Object-Oriented Dynamics Learning through Multi-Level Abstraction

1 code implementation • 16 Apr 2019 • Guangxiang Zhu, Jianhao Wang, Zhizhou Ren, Zichuan Lin, Chongjie Zhang

We also design a spatial-temporal relational reasoning mechanism for MAOP to support instance-level dynamics learning and handle partial observability.

Object • Relational Reasoning • +1
