no code implementations • 26 Feb 2024 • Junzhe Chen, Xuming Hu, Shuodi Liu, Shiyu Huang, Wei-Wei Tu, Zhaofeng He, Lijie Wen
Recent advancements in large language models (LLMs) have revealed their potential for achieving autonomous agents possessing human-level intelligence.
1 code implementation • 16 Feb 2024 • Yiwen Sun, Xianyin Zhang, Shiyu Huang, Shaowei Cai, Bing-Zhen Zhang, Ke Wei
Heuristics are crucial in SAT solvers, while no heuristic rules are suitable for all problem instances.
1 code implementation • 20 Dec 2023 • Shiyu Huang, Wentse Chen, Yiwen Sun, Fuqing Bie, Wei-Wei Tu
We present OpenRL, an advanced reinforcement learning (RL) framework designed to accommodate a diverse array of tasks, from single-agent challenges to complex multi-agent systems.
1 code implementation • 5 Sep 2023 • Haixu Song, Shiyu Huang, Yinpeng Dong, Wei-Wei Tu
The rise of deepfake images, especially of well-known personalities, poses a serious threat to the dissemination of authentic information.
1 code implementation • 23 Aug 2023 • Fanqi Lin, Shiyu Huang, WeiWei Tu
Under such a framework, we also propose a provably efficient diversity reinforcement learning algorithm.
no code implementations • NeurIPS 2023 • Bill Yuchen Lin, Yicheng Fu, Karina Yang, Faeze Brahman, Shiyu Huang, Chandra Bhagavatula, Prithviraj Ammanabrolu, Yejin Choi, Xiang Ren
The Swift module is a small encoder-decoder LM fine-tuned on the oracle agent's action trajectories, while the Sage module employs LLMs such as GPT-4 for subgoal planning and grounding.
1 code implementation • 15 Feb 2023 • Fanqi Lin, Shiyu Huang, Tim Pearce, Wenze Chen, Wei-Wei Tu
Multi-agent football poses an unsolved challenge in AI research.
1 code implementation • 8 Feb 2023 • Xinyi Yang, Shiyu Huang, Yiwen Sun, Yuxiang Yang, Chao Yu, Wei-Wei Tu, Huazhong Yang, Yu Wang
Goal-conditioned hierarchical reinforcement learning (HRL) provides a promising direction to tackle this challenge by introducing a hierarchical structure to decompose the search space, where the low-level policy predicts primitive actions in the guidance of the goals derived from the high-level policy.
Hierarchical Reinforcement Learning Multi-agent Reinforcement Learning +2
1 code implementation • 12 Jul 2022 • Wentse Chen, Shiyu Huang, Yuan Chiang, Tim Pearce, Wei-Wei Tu, Ting Chen, Jun Zhu
We propose Diversity-Guided Policy Optimization (DGPO), an on-policy algorithm that discovers multiple strategies for solving a given task.
1 code implementation • 9 Oct 2021 • Shiyu Huang, Wenze Chen, Longfei Zhang, Shizhen Xu, Ziyang Li, Fengming Zhu, Deheng Ye, Ting Chen, Jun Zhu
To the best of our knowledge, Tikick is the first learning-based AI system that can take over the multi-agent Google Research Football full game, while previous work could either control a single agent or experiment on toy academic scenarios.
1 code implementation • 8 Oct 2021 • Shiyu Huang, Bin Wang, Dong Li, Jianye Hao, Ting Chen, Jun Zhu
In this work, we propose a new algorithm for circuit routing, named Ranking Cost, which innovatively combines search-based methods (i. e., A* algorithm) and learning-based methods (i. e., Evolution Strategies) to form an efficient and trainable router.
no code implementations • 1 Jan 2021 • Shiyu Huang, Bin Wang, Dong Li, Jianye Hao, Jun Zhu, Ting Chen
In our method, we introduce a new set of variables called cost maps, which can help the A* router to find out proper paths to achieve the global object.
no code implementations • ICLR 2020 • Shiyu Huang, Hang Su, Jun Zhu, Ting Chen
Partially Observable Markov Decision Processes (POMDPs) are popular and flexible models for real-world decision-making applications that demand the information from past observations to make optimal decisions.
1 code implementation • CVPR 2017 • Shiyu Huang, Deva Ramanan
Such "in-the-tail" data is notoriously hard to observe, making both training and testing difficult.