1 code implementation • 18 Apr 2024 • Haoyuan Jiang, Ziyue Li, Hua Wei, Xuantang Xiong, Jingqing Ruan, Jiaming Lu, Hangyu Mao, Rui Zhao
The effectiveness of traffic light control has been significantly improved by current reinforcement learning-based approaches via better cooperation among multiple traffic lights.
1 code implementation • 13 Mar 2024 • Zhishuai Li, Xiang Wang, Jingjing Zhao, Sun Yang, Guoqing Du, Xiaoru Hu, Bin Zhang, Yuxiao Ye, Ziyue Li, Rui Zhao, Hangyu Mao
Then, in the first stage, question-SQL pairs are retrieved as few-shot demonstrations, prompting the LLM to generate a preliminary SQL (PreSQL).
Ranked #1 on Text-To-SQL on Spider
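The first-stage retrieval described above can be sketched roughly as follows. This is a hedged illustration only: the similarity metric (word overlap), the prompt layout, and all names here are assumptions, not the paper's actual retriever or prompt format.

```python
def retrieve_demonstrations(question, pool, k=2):
    """Rank stored (question, SQL) pairs by Jaccard word overlap with the
    new question and return the top-k as few-shot demonstrations."""
    def overlap(a, b):
        wa, wb = set(a.lower().split()), set(b.lower().split())
        return len(wa & wb) / max(1, len(wa | wb))
    ranked = sorted(pool, key=lambda p: overlap(question, p[0]), reverse=True)
    return ranked[:k]

def build_prompt(question, demos):
    """Assemble the demonstrations and the new question into a prompt
    for the LLM to generate a preliminary SQL (PreSQL)."""
    shots = "\n\n".join(f"Q: {q}\nSQL: {sql}" for q, sql in demos)
    return f"{shots}\n\nQ: {question}\nSQL:"

pool = [
    ("how many users are there", "SELECT COUNT(*) FROM users"),
    ("list all orders", "SELECT * FROM orders"),
]
prompt = build_prompt("how many orders are there",
                      retrieve_demonstrations("how many orders are there", pool, k=1))
```

The LLM's completion of `prompt` would then serve as the PreSQL for the second stage.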
no code implementations • 5 Mar 2024 • Bin Zhang, Yuxiao Ye, Guoqing Du, Xiaoru Hu, Zhishuai Li, Sun Yang, Chi Harold Liu, Rui Zhao, Ziyue Li, Hangyu Mao
Then we formulate five evaluation tasks to comprehensively assess the performance of diverse methods across various LLMs throughout the Text-to-SQL process. Our study highlights the performance disparities among LLMs and proposes optimal in-context learning solutions tailored to each task.
2 code implementations • 26 Dec 2023 • Hangyu Mao, Rui Zhao, Ziyue Li, Zhiwei Xu, Hao Chen, Yiqun Chen, Bin Zhang, Zhen Xiao, Junge Zhang, Jiangjin Yin
Designing better deep networks and better reinforcement learning (RL) algorithms are both important for deep RL.
1 code implementation • 22 Dec 2023 • Jiaming Lu, Jingqing Ruan, Haoyuan Jiang, Ziyue Li, Hangyu Mao, Rui Zhao
Furthermore, we implement a scenario-shared Co-Train module to facilitate the learning of generalizable dynamics information across different scenarios.
no code implementations • 23 Nov 2023 • Bin Zhang, Hangyu Mao, Jingqing Ruan, Ying Wen, Yang Li, Shao Zhang, Zhiwei Xu, Dapeng Li, Ziyue Li, Rui Zhao, Lijuan Li, Guoliang Fan
The remarkable progress in Large Language Models (LLMs) opens up new avenues for addressing planning and decision-making problems in Multi-Agent Systems (MAS).
no code implementations • 19 Nov 2023 • Yilun Kong, Jingqing Ruan, Yihong Chen, Bin Zhang, Tianpeng Bao, Shiwei Shi, Guoqing Du, Xiaoru Hu, Hangyu Mao, Ziyue Li, Xingyu Zeng, Rui Zhao
Large Language Models (LLMs) have demonstrated proficiency in addressing tasks that necessitate a combination of task planning and the use of external tools, such as APIs.
no code implementations • 28 Oct 2023 • Guanghu Sui, Zhishuai Li, Ziyue Li, Sun Yang, Jingqing Ruan, Hangyu Mao, Rui Zhao
Our experiments with Large Language Models (LLMs) demonstrate a significant performance improvement on the business dataset and show the substantial potential of our method.
no code implementations • 7 Aug 2023 • Jingqing Ruan, Yihong Chen, Bin Zhang, Zhiwei Xu, Tianpeng Bao, Guoqing Du, Shiwei Shi, Hangyu Mao, Ziyue Li, Xingyu Zeng, Rui Zhao
With recent advancements in natural language processing, Large Language Models (LLMs) have emerged as powerful tools for various real-world applications.
no code implementations • 13 May 2023 • Bin Zhang, Hangyu Mao, Lijuan Li, Zhiwei Xu, Dapeng Li, Rui Zhao, Guoliang Fan
Our research contributes to the development of an effective and adaptable asynchronous action coordination method that can be widely applied to various task types and environmental configurations in MAS.
1 code implementation • 30 Dec 2022 • Hangyu Mao, Rui Zhao, Hao Chen, Jianye Hao, Yiqun Chen, Dong Li, Junge Zhang, Zhen Xiao
Recent methods combine the Transformer with these modules for better performance.
no code implementations • 17 Oct 2022 • Yiqun Chen, Hangyu Mao, Jiaxin Mao, Shiguang Wu, Tianle Zhang, Bin Zhang, Wei Yang, Hongxing Chang
Furthermore, we introduce a novel paradigm named Personalized Training with Distilled Execution (PTDE), wherein agent-personalized global information is distilled into the agent's local information.
no code implementations • 10 Mar 2022 • Xiaotian Hao, Hangyu Mao, Weixun Wang, Yaodong Yang, Dong Li, Yan Zheng, Zhen Wang, Jianye Hao
To break this curse, we propose a unified agent permutation framework that exploits the permutation invariance (PI) and permutation equivariance (PE) inductive biases to reduce the multiagent state space.
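The permutation-invariance (PI) inductive bias mentioned above can be illustrated with a minimal sketch: a network that embeds each agent's features with shared weights and mean-pools, so its output cannot depend on agent ordering. This is a generic PI construction for illustration, not the paper's architecture.

```python
import numpy as np

def pi_joint_encoder(agent_feats, W_embed, W_out):
    """Permutation-invariant encoder: embed each agent's features with
    shared weights, then mean-pool so agent ordering cannot matter."""
    h = np.tanh(agent_feats @ W_embed)   # (n_agents, d) -> (n_agents, k)
    pooled = h.mean(axis=0)              # order-independent summary
    return pooled @ W_out

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))              # 4 agents, 3 features each
W1 = rng.normal(size=(3, 8))
W2 = rng.normal(size=(8, 2))
out = pi_joint_encoder(x, W1, W2)        # same output for any agent order
```

Because pooling commutes with row permutation, shuffling the agents leaves the output unchanged, which is exactly the state-space reduction PI buys.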
no code implementations • 17 Nov 2021 • Hangyu Mao, Chao Wang, Xiaotian Hao, Yihuan Mao, Yiming Lu, Chengjie WU, Jianye Hao, Dong Li, Pingzhong Tang
The MineRL competition is designed for the development of reinforcement learning and imitation learning algorithms that can efficiently leverage human demonstrations to drastically reduce the number of environment interactions needed to solve the complex ObtainDiamond task with sparse rewards.
no code implementations • 29 Sep 2021 • Hangyu Mao, Jianye Hao, Dong Li, Jun Wang, Weixun Wang, Xiaotian Hao, Bin Wang, Kun Shao, Zhen Xiao, Wulong Liu
In contrast, we formulate an explicit credit assignment problem in which each agent gives its suggestion about how to weight individual Q-values so as to explicitly maximize the joint Q-value, while also guaranteeing the Bellman optimality of the joint Q-value.
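A toy version of the weighting idea described above might look like the following. All names and the averaging scheme are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def joint_q(individual_qs, suggested_weights):
    """Each agent i suggests a score vector over all agents' Q-values;
    softmax-normalize each suggestion, combine the Q-values with it,
    and average the agents' weighted combinations into a joint Q."""
    w = np.exp(suggested_weights)
    w = w / w.sum(axis=1, keepdims=True)       # each agent's weights sum to 1
    return float(np.mean(w @ individual_qs))   # average of weighted sums

qs = np.array([1.0, 2.0, 3.0])                 # per-agent Q-values
weights = np.zeros((3, 3))                     # uniform suggestions
print(joint_q(qs, weights))                    # -> 2.0 (plain mean)
```

With non-uniform suggestions the joint Q shifts toward the agents the suggestions emphasize, which is what makes the credit assignment explicit rather than implicit.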
no code implementations • 7 Jun 2021 • William Hebgen Guss, Stephanie Milani, Nicholay Topin, Brandon Houghton, Sharada Mohanty, Andrew Melnik, Augustin Harter, Benoit Buschmaas, Bjarne Jaster, Christoph Berganski, Dennis Heitkamp, Marko Henning, Helge Ritter, Chengjie WU, Xiaotian Hao, Yiming Lu, Hangyu Mao, Yihuan Mao, Chao Wang, Michal Opanowicz, Anssi Kanervisto, Yanick Schraner, Christian Scheller, Xiren Zhou, Lu Liu, Daichi Nishio, Toi Tsuneda, Karolis Ramanauskas, Gabija Juceviciute
Reinforcement learning competitions have formed the basis for standard research benchmarks, galvanized advances in the state-of-the-art, and shaped the direction of the field.
no code implementations • 1 Jun 2021 • Tianze Zhou, Fubiao Zhang, Kun Shao, Kai Li, Wenhan Huang, Jun Luo, Weixun Wang, Yaodong Yang, Hangyu Mao, Bin Wang, Dong Li, Wulong Liu, Jianye Hao
In addition, we use a novel agent network named Population Invariant agent with Transformer (PIT) to realize coordination transfer across a wider variety of scenarios.
no code implementations • ICLR Workshop SSL-RL 2021 • Changmin Yu, Dong Li, Hangyu Mao, Jianye Hao, Neil Burgess
Representation learning is a popular approach for reinforcement learning (RL) tasks with partially observable Markov decision processes.
1 code implementation • NeurIPS 2021 • Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang
We study Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which extends conventional value function approximator (VFA) to take as input not only the state (and action) but also an explicit policy representation.
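The extension described above can be sketched in a few lines: a value approximator that consumes the state concatenated with a policy representation. The random-projection "fingerprint" below is purely an assumption for illustration; the paper studies learned policy representations.

```python
import numpy as np

rng = np.random.default_rng(0)

def policy_fingerprint(policy_params, k=8):
    """Toy policy representation: a fixed random projection of the
    flattened policy parameters into a k-dim embedding."""
    theta = np.concatenate([p.ravel() for p in policy_params])
    proj = np.random.default_rng(42).normal(size=(theta.size, k)) / np.sqrt(theta.size)
    return theta @ proj

def pevfa_value(state, policy_emb, W):
    """Value approximator over the extended (state, policy) input."""
    x = np.concatenate([state, policy_emb])
    return float(np.tanh(x) @ W)

state = rng.normal(size=4)
emb = policy_fingerprint([rng.normal(size=(2, 3))])   # 6 params -> 8-dim embedding
W = rng.normal(size=state.size + emb.size)
v = pevfa_value(state, emb, W)
```

The point of the extended input is that a single approximator can generalize value estimates across policies, not just across states.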
no code implementations • 28 Sep 2020 • Tianpei Yang, Jianye Hao, Weixun Wang, Hongyao Tang, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yujing Hu, Yingfeng Chen, Changjie Fan
In many cases, agents' experiences are inconsistent with one another, which causes the option-value estimates to oscillate and become inaccurate.
Open-Ended Question Answering • Reinforcement Learning (RL) +1
no code implementations • ICLR 2018 • Hangyu Mao, Zhibo Gong, Zhen Xiao
In this paper, we study reward design problem in cooperative MARL based on packet routing environments.
Multi-agent Reinforcement Learning • reinforcement-learning +1
no code implementations • 3 Dec 2019 • Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong, Yan Ni
We evaluate the gating mechanism on several tasks.
no code implementations • 3 Dec 2019 • Hangyu Mao, Wulong Liu, Jianye Hao, Jun Luo, Dong Li, Zhengchao Zhang, Jun Wang, Zhen Xiao
Social psychology and real-world experience show that cognitive consistency plays an important role in keeping human society in order: if people have a more consistent cognition of their environment, they are more likely to achieve better cooperation.
no code implementations • 26 Feb 2019 • Hangyu Mao, Zhibo Gong, Zhengchao Zhang, Zhen Xiao, Yan Ni
Communication is an important factor for the big multi-agent world to stay organized and productive.
no code implementations • 13 Nov 2018 • Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong
Second, to model the teammates' policies using the collected information in an effective way, ATT-MADDPG enhances the centralized critic with an attention mechanism.
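The attention mechanism mentioned above can be sketched generically: the centralized critic scores each teammate's action features against a query and forms an attention-weighted summary. Shapes and names here are illustrative, not ATT-MADDPG's exact architecture.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())      # shift for numerical stability
    return e / e.sum()

def attend_teammates(query, teammate_feats):
    """Dot-product attention over teammates' action features,
    returning the weights and the weighted summary fed to the critic."""
    scores = teammate_feats @ query        # (n_teammates,)
    attn = softmax(scores)                 # weights sum to 1
    return attn, attn @ teammate_feats     # weighted teammate summary

rng = np.random.default_rng(1)
feats = rng.normal(size=(3, 5))            # 3 teammates, 5-dim action features
q = rng.normal(size=5)
weights, summary = attend_teammates(q, feats)
```

The summary adapts as teammates' policies change, which is what lets the critic model them without a fixed per-teammate slot.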
no code implementations • 10 Jun 2017 • Hangyu Mao, Zhibo Gong, Yan Ni, Zhen Xiao
Communication is a critical factor for the big multi-agent world to stay organized and productive.
Multi-agent Reinforcement Learning • reinforcement-learning +1
no code implementations • COLING 2016 • Yang Xiao, Yu-An Wang, Hangyu Mao, Zhen Xiao
Accurate prediction of user attributes from social media is valuable for both social science analysis and consumer targeting.