Search Results for author: Yujing Hu

Found 23 papers, 7 papers with code

XRL-Bench: A Benchmark for Evaluating and Comparing Explainable Reinforcement Learning Techniques

no code implementations • 20 Feb 2024 • Yu Xiong, Zhipeng Hu, Ye Huang, Runze Wu, Kai Guan, Xingchen Fang, Ji Jiang, Tianze Zhou, Yujing Hu, Haoyu Liu, Tangjie Lyu, Changjie Fan

To address this, we introduce XRL-Bench, a unified standardized benchmark tailored for the evaluation and comparison of XRL methods, encompassing three main modules: standard RL environments, explainers based on state importance, and standard evaluators.

Decision Making Reinforcement Learning (RL)

Paper
Add Code

AlignDiff: Aligning Diverse Human Preferences via Behavior-Customisable Diffusion Model

no code implementations • 3 Oct 2023 • Zibin Dong, Yifu Yuan, Jianye Hao, Fei Ni, Yao Mu, Yan Zheng, Yujing Hu, Tangjie Lv, Changjie Fan, Zhipeng Hu

Aligning agent behaviors with diverse human preferences remains a challenging problem in reinforcement learning (RL), owing to the inherent abstractness and mutability of human preferences.

Attribute Reinforcement Learning (RL)

Paper
Add Code

Prioritized Trajectory Replay: A Replay Memory for Data-driven Reinforcement Learning

no code implementations • 27 Jun 2023 • Jinyi Liu, Yi Ma, Jianye Hao, Yujing Hu, Yan Zheng, Tangjie Lv, Changjie Fan

In summary, our research emphasizes the significance of trajectory-based data sampling techniques in enhancing the efficiency and performance of offline RL algorithms.

D4RL Offline RL +2

Paper
Add Code

Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition

no code implementations • 23 Mar 2023 • Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Sharada Mohanty, Byron Galbraith, Ke Chen, Yan Song, Tianze Zhou, Bingquan Yu, He Liu, Kai Guan, Yujing Hu, Tangjie Lv, Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik, Shu Ishida, João F. Henriques, Robert Klassert, Walter Laurito, Ellen Novoseller, Vinicius G. Goecks, Nicholas Waytowich, David Watkins, Josh Miller, Rohin Shah

To facilitate research in the direction of fine-tuning foundation models from human feedback, we held the MineRL BASALT Competition on Fine-Tuning from Human Feedback at NeurIPS 2022.

Paper
Add Code

Adaptive Value Decomposition with Greedy Marginal Contribution Computation for Cooperative Multi-Agent Reinforcement Learning

1 code implementation • 14 Feb 2023 • Shanqi Liu, Yujing Hu, Runze Wu, Dong Xing, Yu Xiong, Changjie Fan, Kun Kuang, Yong liu

We first illustrate that the proposed value decomposition can consider the complicated interactions among agents and is feasible to learn in large-scale scenarios.

Multi-agent Reinforcement Learning

Paper
Code

Towards Skilled Population Curriculum for Multi-Agent Reinforcement Learning

no code implementations • 7 Feb 2023 • Rundong Wang, Longtao Zheng, Wei Qiu, Bowei He, Bo An, Zinovi Rabinovich, Yujing Hu, Yingfeng Chen, Tangjie Lv, Changjie Fan

Despite its success, ACL's applicability is limited by (1) the lack of a general student framework for dealing with the varying number of agents across tasks and the sparse reward problem, and (2) the non-stationarity of the teacher's task due to ever-changing student strategies.

Multi-agent Reinforcement Learning reinforcement-learning +1

Paper
Add Code

Neural Episodic Control with State Abstraction

no code implementations • 27 Jan 2023 • Zhuo Li, Derui Zhu, Yujing Hu, Xiaofei Xie, Lei Ma, Yan Zheng, Yan Song, Yingfeng Chen, Jianjun Zhao

Generally, episodic control-based approaches are solutions that leverage highly-rewarded past experiences to improve sample efficiency of DRL algorithms.

OpenAI Gym

Paper
Add Code

EUCLID: Towards Efficient Unsupervised Reinforcement Learning with Multi-choice Dynamics Model

no code implementations • 2 Oct 2022 • Yifu Yuan, Jianye Hao, Fei Ni, Yao Mu, Yan Zheng, Yujing Hu, Jinyi Liu, Yingfeng Chen, Changjie Fan

Unsupervised reinforcement learning (URL) poses a promising paradigm to learn useful behaviors in a task-agnostic environment without the guidance of extrinsic rewards to facilitate the fast adaptation of various downstream tasks.

reinforcement-learning Reinforcement Learning (RL) +2

Paper
Add Code

Automatic Reward Design via Learning Motivation-Consistent Intrinsic Rewards

no code implementations • 29 Jul 2022 • Yixiang Wang, Yujing Hu, Feng Wu, Yingfeng Chen

In this paper, we propose to automatically generate goal-consistent intrinsic rewards for the agent to learn, by maximizing which the expected accumulative extrinsic rewards can be maximized.

Paper
Add Code

Off-Beat Multi-Agent Reinforcement Learning

no code implementations • 27 May 2022 • Wei Qiu, Weixun Wang, Rundong Wang, Bo An, Yujing Hu, Svetlana Obraztsova, Zinovi Rabinovich, Jianye Hao, Yingfeng Chen, Changjie Fan

During execution durations, the environment changes are influenced by, but not synchronised with, action execution.

Multi-agent Reinforcement Learning reinforcement-learning +3

Paper
Add Code

Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games

1 code implementation • NeurIPS 2021 • Xiangyu Liu, Hangtian Jia, Ying Wen, Yaodong Yang, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu

With this unified diversity measure, we design the corresponding diversity-promoting objective and population effectivity when seeking the best responses in open-ended learning.

Paper
Code

Episodic Multi-agent Reinforcement Learning with Curiosity-Driven Exploration

2 code implementations • NeurIPS 2021 • Lulu Zheng, Jiarui Chen, Jianhao Wang, Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang

Efficient exploration in deep cooperative multi-agent reinforcement learning (MARL) still remains challenging in complex coordination problems.

Efficient Exploration Multi-agent Reinforcement Learning +4

Paper
Code

Learning the Representation of Behavior Styles with Imitation Learning

no code implementations • 29 Sep 2021 • Xiao Liu, Meng Wang, Zhaorong Wang, Yingfeng Chen, Yujing Hu, Changjie Fan, Chongjie Zhang

Imitation learning is one of the methods for reproducing expert demonstrations adaptively by learning a mapping between observations and actions.

Imitation Learning

Paper
Add Code

Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games

no code implementations • 9 Jun 2021 • Xiangyu Liu, Hangtian Jia, Ying Wen, Yaodong Yang, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu

With this unified diversity measure, we design the corresponding diversity-promoting objective and population effectivity when seeking the best responses in open-ended learning.

Paper
Add Code

Fever Basketball: A Complex, Flexible, and Asynchronized Sports Game Environment for Multi-agent Reinforcement Learning

no code implementations • 6 Dec 2020 • Hangtian Jia, Yujing Hu, Yingfeng Chen, Chunxu Ren, Tangjie Lv, Changjie Fan, Chongjie Zhang

We introduce the Fever Basketball game, a novel reinforcement learning environment where agents are trained to play basketball game.

Board Games Multi-agent Reinforcement Learning +2

Paper
Add Code

Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping

no code implementations • NeurIPS 2020 • Yujing Hu, Weixun Wang, Hangtian Jia, Yixiang Wang, Yingfeng Chen, Jianye Hao, Feng Wu, Changjie Fan

In this paper, we consider the problem of adaptively utilizing a given shaping reward function.

Reinforcement Learning (RL)

Paper
Add Code

Transfer among Agents: An Efficient Multiagent Transfer Learning Framework

no code implementations • 28 Sep 2020 • Tianpei Yang, Jianye Hao, Weixun Wang, Hongyao Tang, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yujing Hu, Yingfeng Chen, Changjie Fan

In many cases, each agent's experience is inconsistent with each other which causes the option-value estimation to oscillate and to become inaccurate.

Open-Ended Question Answering Reinforcement Learning (RL) +1

Paper
Add Code

Exploring Unknown States with Action Balance

2 code implementations • 10 Mar 2020 • Yan Song, Yingfeng Chen, Yujing Hu, Changjie Fan

In this paper, we focus on improving the effectiveness of finding unknown states and propose action balance exploration, which balances the frequency of selecting each action at a given state and can be treated as an extension of upper confidence bound (UCB) to deep reinforcement learning.

Montezuma's Revenge reinforcement-learning +1

Paper
Code

Efficient Deep Reinforcement Learning via Adaptive Policy Transfer

1 code implementation • 19 Feb 2020 • Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Yujing Hu, Yingfeng Cheng, Changjie Fan, Weixun Wang, Wulong Liu, Zhaodong Wang, Jiajie Peng

Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Code

Multi-Agent Game Abstraction via Graph Attention Neural Network

no code implementations • 25 Nov 2019 • Yong Liu, Weixun Wang, Yujing Hu, Jianye Hao, Xingguo Chen, Yang Gao

Traditional methods attempt to use pre-defined rules to capture the interaction relationship between agents.

Graph Attention Multi-agent Reinforcement Learning

Paper
Add Code

From Few to More: Large-scale Dynamic Multiagent Curriculum Learning

no code implementations • 6 Sep 2019 • Weixun Wang, Tianpei Yang, Yong liu, Jianye Hao, Xiaotian Hao, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao

In this paper, we design a novel Dynamic Multiagent Curriculum Learning (DyMA-CL) to solve large-scale problems by starting from learning on a multiagent scenario with a small size and progressively increasing the number of agents.

Paper
Add Code

Action Semantics Network: Considering the Effects of Actions in Multiagent Systems

1 code implementation • ICLR 2020 • Weixun Wang, Tianpei Yang, Yong liu, Jianye Hao, Xiaotian Hao, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao

ASN characterizes different actions' influence on other agents using neural networks based on the action semantics between them.

Starcraft Starcraft II

Paper
Code

Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization, Analysis, and Application

1 code implementation • 2 Mar 2018 • Yujing Hu, Qing Da, An-Xiang Zeng, Yang Yu, Yinghui Xu

For better utilizing the correlation between different ranking steps, in this paper, we propose to use reinforcement learning (RL) to learn an optimal ranking policy which maximizes the expected accumulative rewards in a search session.

Decision Making Learning-To-Rank +1

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.