no code implementations • 19 Feb 2024 • Yang Li, WenHao Zhang, Jianhong Wang, Shao Zhang, Yali Du, Ying Wen, Wei Pan
The visualization of learning dynamics effectively demonstrates that AgA successfully achieves alignment between individual and collective objectives.
no code implementations • 19 Feb 2024 • Zhixun Chen, Yali Du, David Mguni
Theoretically, we prove that LONDI learns the subset of system states in which the LLM must be activated to solve the task.
no code implementations • 11 Feb 2024 • Xidong Feng, Ziyu Wan, Mengyue Yang, Ziyan Wang, Girish A. Koushik, Yali Du, Ying Wen, Jun Wang
Reinforcement Learning (RL) has shown remarkable abilities in learning policies for decision-making tasks.
no code implementations • 10 Feb 2024 • Nam Phuong Tran, The Anh Ta, Shuqing Shi, Debmalya Mandal, Yali Du, Long Tran-Thanh
Reward allocation, also known as the credit assignment problem, has been an important topic in economics, engineering, and machine learning.
no code implementations • 15 Jan 2024 • Xingzhou Lou, Junge Zhang, Ziyan Wang, Kaiqi Huang, Yali Du
Through the use of pre-trained LMs and the elimination of the need for a ground-truth cost, our method enhances safe policy learning under a diverse set of human-derived free-form natural language constraints.
no code implementations • 29 Dec 2023 • Zijing Shi, Meng Fang, Shunfeng Zheng, Shilong Deng, Ling Chen, Yali Du
This problem motivates the area of ad hoc teamwork, in which an agent may potentially cooperate with a variety of teammates to achieve a shared goal.
1 code implementation • 25 Dec 2023 • Xingzhou Lou, Junge Zhang, Timothy J. Norman, Kaiqi Huang, Yali Du
We propose Topology-based multi-Agent Policy gradiEnt (TAPE) for both stochastic and deterministic MAPG methods.
no code implementations • 8 Dec 2023 • Yali Du, Joel Z. Leibo, Usman Islam, Richard Willis, Peter Sunehag
Cooperation in multi-agent learning (MAL) is a topic at the intersection of numerous disciplines, including game theory, economics, social sciences, and evolutionary biology.
no code implementations • 6 Dec 2023 • Ziyan Wang, Yali Du, Yudi Zhang, Meng Fang, Biwei Huang
Offline Multi-agent Reinforcement Learning (MARL) is valuable in scenarios where online interaction is impractical or risky.
1 code implementation • NeurIPS 2023 • Shutong Ding, Jingya Wang, Yali Du, Ye Shi
To the best of our knowledge, RPO is the first attempt to introduce GRG to RL as a way of efficiently handling both equality and inequality hard constraints.
1 code implementation • NeurIPS 2023 • Mengyue Yang, Zhen Fang, Yonggang Zhang, Yali Du, Furui Liu, Jean-Francois Ton, Jianhong Wang, Jun Wang
To capture the information of sufficient and necessary causes, we employ a classical concept, the probability of necessity and sufficiency (PNS), which indicates the probability that a given cause is both necessary and sufficient.
no code implementations • 5 Aug 2023 • Jiarui Jin, Xianyu Chen, Weinan Zhang, Mengyue Yang, Yang Wang, Yali Du, Yong Yu, Jun Wang
Noting that these ranking metrics do not consider the effects of contextual dependence among the items in the list, we design a new family of simulation-based ranking metrics, of which existing metrics can be regarded as special cases.
1 code implementation • NeurIPS 2023 • Xidong Feng, Yicheng Luo, Ziyan Wang, Hongrui Tang, Mengyue Yang, Kun Shao, David Mguni, Yali Du, Jun Wang
Thus, we propose ChessGPT, a GPT model bridging policy learning and language modeling by integrating data from these two sources in Chess games.
no code implementations • 6 Jun 2023 • Runze Liu, Yali Du, Fengshuo Bai, Jiafei Lyu, Xiu Li
In this paper, we propose a novel zero-shot preference-based RL algorithm that leverages labeled preference data from source tasks to infer labels for target tasks, eliminating the requirement for human queries.
1 code implementation • 5 Jun 2023 • Yang Li, Shao Zhang, Jichen Sun, WenHao Zhang, Yali Du, Ying Wen, Xinbing Wang, Wei Pan
In order to solve cooperative incompatibility in learning and effectively address the problem in the context of ZSC, we introduce the Cooperative Open-ended LEarning (COLE) framework, which formulates open-ended objectives in cooperative games with two players using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy.
no code implementations • NeurIPS 2023 • Yudi Zhang, Yali Du, Biwei Huang, Ziyan Wang, Jun Wang, Meng Fang, Mykola Pechenizkiy
While the majority of current approaches construct the reward redistribution in an uninterpretable manner, we propose to explicitly model the contributions of state and action from a causal perspective, resulting in an interpretable reward redistribution and preserving policy invariance.
no code implementations • 19 May 2023 • Liting Chen, Lu Wang, Hang Dong, Yali Du, Jie Yan, Fangkai Yang, Shuang Li, Pu Zhao, Si Qin, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang
The emergence of large language models (LLMs) has substantially influenced natural language processing, demonstrating exceptional results across various tasks.
1 code implementation • 15 Apr 2023 • Sirui Chen, Zhaowei Zhang, Yaodong Yang, Yali Du
It first decomposes the global return back to each time step, then utilizes the Shapley Value to redistribute the individual payoff from the decomposed global reward.
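The Shapley Value step mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation: the agent names and the `toy_step_reward` function are hypothetical, and the exact enumeration over orderings shown here is only practical for a handful of agents.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence


def shapley_values(
    agents: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
) -> Dict[str, float]:
    """Exact Shapley values: average each agent's marginal
    contribution over every ordering of the agents."""
    totals = {agent: 0.0 for agent in agents}
    orderings = list(permutations(agents))
    for order in orderings:
        coalition: FrozenSet[str] = frozenset()
        for agent in order:
            with_agent = coalition | {agent}
            totals[agent] += value(with_agent) - value(coalition)
            coalition = with_agent
    return {agent: total / len(orderings) for agent, total in totals.items()}


def toy_step_reward(coalition: FrozenSet[str]) -> float:
    """Hypothetical per-time-step team reward (an assumption for illustration)."""
    reward = 0.0
    if "a" in coalition:
        reward += 1.0
    if "b" in coalition:
        reward += 2.0
    if {"a", "b"} <= coalition:
        reward += 3.0  # synergy bonus when a and b act together
    return reward


phi = shapley_values(["a", "b", "c"], toy_step_reward)
```

In this toy case the synergy bonus is split evenly between `a` and `b`, agent `c` receives nothing, and the individual payoffs sum to the full team reward for that step.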
no code implementations • 25 Feb 2023 • Shangding Gu, Alap Kshirsagar, Yali Du, Guang Chen, Jan Peters, Alois Knoll
Deployment of Reinforcement Learning (RL) algorithms for robotics applications in the real world requires ensuring the safety of the robot and its environment.
1 code implementation • 9 Feb 2023 • Yang Li, Shao Zhang, Jichen Sun, Yali Du, Ying Wen, Xinbing Wang, Wei Pan
However, these approaches can result in a loss of learning and an inability to cooperate with certain strategies within the population, known as cooperative incompatibility.
no code implementations • 7 Feb 2023 • Lukas Schäfer, Oliver Slumbers, Stephen Mcaleer, Yali Du, Stefano V. Albrecht, David Mguni
In this work, we propose ensemble value functions for multi-agent exploration (EMAX), a general framework to seamlessly extend value-based MARL algorithms with ensembles of value functions.
1 code implementation • 16 Jan 2023 • Xingzhou Lou, Jiaxian Guo, Junge Zhang, Jun Wang, Kaiqi Huang, Yali Du
We conduct experiments on the Overcooked environment, and evaluate the zero-shot human-AI coordination performance of our method with both behavior-cloned human proxies and real humans.
1 code implementation • 22 Dec 2022 • Yali Du, Yinwei Wei, Wei Ji, Fan Liu, Xin Luo, Liqiang Nie
The booming development and huge market of micro-videos bring new e-commerce channels for merchants.
no code implementations • 15 Nov 2022 • Runji Lin, Ye Li, Xidong Feng, Zhaowei Zhang, Xian Hong Wu Fung, Haifeng Zhang, Jun Wang, Yali Du, Yaodong Yang
Firstly, we propose prompt tuning for offline RL, where a context vector sequence is concatenated with the input to guide the conditional policy generation.
2 code implementations • 13 Jul 2022 • Yali Du, Chengdong Ma, Yuchen Liu, Runji Lin, Hao Dong, Jun Wang, Yaodong Yang
Reinforcement learning algorithms require a large number of samples; this often limits their real-world applications, even on simple tasks.
1 code implementation • 20 May 2022 • Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Yaodong Yang, Alois Knoll
To establish a good foundation for future research in this thread, in this paper, we provide a review of safe RL from the perspectives of methods, theory, and applications.
1 code implementation • ACL 2022 • Yunqiu Xu, Meng Fang, Ling Chen, Yali Du, Joey Tianyi Zhou, Chengqi Zhang
Text-based games provide an interactive way to study natural language processing.
1 code implementation • ICLR 2022 • Rui Yang, Yiming Lu, Wenzhe Li, Hao Sun, Meng Fang, Yali Du, Xiu Li, Lei Han, Chongjie Zhang
In this paper, we revisit the theoretical property of GCSL (optimizing a lower bound of the goal-reaching objective) and extend GCSL into a novel offline goal-conditioned RL algorithm.
1 code implementation • 12 Jan 2022 • Xue Yan, Yali Du, Binxin Ru, Jun Wang, Haifeng Zhang, Xu Chen
The Elo rating system is widely adopted to evaluate the skills of (chess) game and sports players.
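For readers unfamiliar with Elo, the standard two-player update can be sketched as follows. This shows only the classical rating rule, not the method proposed in the paper; the K-factor of 32 and the 400-point scale are conventional defaults assumed here.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B under the logistic Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(
    rating_a: float,
    rating_b: float,
    score_a: float,
    k: float = 32.0,  # assumed K-factor; real systems tune this
) -> Tuple[float, float]:
    """Return updated (rating_a, rating_b) after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


new_a, new_b = elo_update(1500.0, 1500.0, score_a=1.0)
```

The update is zero-sum: whatever rating one player gains, the other loses, and an upset against a much stronger opponent moves the ratings more than an expected result.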
no code implementations • 29 Sep 2021 • Meng Fang, Yunqiu Xu, Yali Du, Ling Chen, Chengqi Zhang
In a variety of text-based games, we show that this simple method results in competitive performance for agents.
no code implementations • 29 Sep 2021 • Yuchen Liu, Yali Du, Runji Lin, Hangrui Bi, Mingdong Wu, Jun Wang, Hao Dong
Model-based RL is an effective approach for reducing sample complexity.
1 code implementation • Findings (EMNLP) 2021 • Yunqiu Xu, Meng Fang, Ling Chen, Yali Du, Chengqi Zhang
Deep reinforcement learning provides a promising approach for text-based games in studying natural language communication between humans and artificial agents.
no code implementations • 17 Aug 2021 • Zhijian Duan, Wenhan Huang, Dinghuai Zhang, Yali Du, Jun Wang, Yaodong Yang, Xiaotie Deng
In this paper, we investigate the learnability of the function approximator that approximates Nash equilibrium (NE) for games generated from a distribution.
no code implementations • 1 Jul 2021 • Rui Yang, Meng Fang, Lei Han, Yali Du, Feng Luo, Xiu Li
Replacing original goals with virtual goals generated from interaction with a trained dynamics model leads to a novel relabeling method, model-based relabeling (MBR).
1 code implementation • 14 May 2021 • Xiaoqiang Wang, Yali Du, Shengyu Zhu, Liangjun Ke, Zhitang Chen, Jianye Hao, Jun Wang
Discovering causal relations among a set of variables is a long-standing question in many empirical sciences.
no code implementations • 1 Jan 2021 • Yali Du, Yifan Zhao, Meng Fang, Jun Wang, Gangyan Xu, Haifeng Zhang
Multi-agent control in networked systems is one of the biggest challenges in Reinforcement Learning (RL), and only limited success has been reported compared to recent deep reinforcement learning in the single-agent domain.
1 code implementation • NeurIPS 2020 • Yunqiu Xu, Meng Fang, Ling Chen, Yali Du, Joey Tianyi Zhou, Chengqi Zhang
We study reinforcement learning (RL) for text-based games, which are interactive simulations in the context of natural language.
1 code implementation • NeurIPS 2019 • Yali Du, Lei Han, Meng Fang, Ji Liu, Tianhong Dai, DaCheng Tao
A great challenge in cooperative decentralized multi-agent reinforcement learning (MARL) is generating diversified behaviors for each individual agent when receiving only a team reward.
1 code implementation • NeurIPS 2019 • Meng Fang, Tianyi Zhou, Yali Du, Lei Han, Zhengyou Zhang
This "Goal-and-Curiosity-driven Curriculum Learning" leads to "Curriculum-guided HER (CHER)", which adaptively and dynamically controls the exploration-exploitation trade-off during the learning process via hindsight experience selection.
no code implementations • 10 Sep 2019 • Liheng Chen, Hongyi Guo, Yali Du, Fei Fang, Haifeng Zhang, Yaoming Zhu, Ming Zhou, Wei-Nan Zhang, Qing Wang, Yong Yu
Although existing works formulate this problem within a centralized-learning-with-decentralized-execution framework, which avoids the non-stationarity problem in training, their decentralized execution paradigm limits the agents' capability to coordinate.
1 code implementation • 9 Sep 2018 • Yali Du, Meng Fang, Jin-Feng Yi, Jun Cheng, DaCheng Tao
First, we initialize an adversarial example with a gray color image on which every pixel has roughly the same importance for the target model.