Search Results for author: Haifeng Zhang

Found 22 papers, 9 papers with code

Token-level Direct Preference Optimization

1 code implementation • 18 Apr 2024 • Yongcheng Zeng, Guoqing Liu, Weiyu Ma, Ning Yang, Haifeng Zhang, Jun Wang

Fine-tuning pre-trained Large Language Models (LLMs) is essential to align them with human values and intentions.

Learning Macroeconomic Policies based on Microfoundations: A Stackelberg Mean Field Game Approach

no code implementations • 14 Mar 2024 • Qirui Mi, Zhiyu Zhao, Siyu Xia, Yan Song, Jun Wang, Haifeng Zhang

Effective macroeconomic policies play a crucial role in promoting economic growth and social stability.

Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach

1 code implementation • 19 Dec 2023 • Weiyu Ma, Qirui Mi, Xue Yan, Yuqiao Wu, Runji Lin, Haifeng Zhang, Jun Wang

StarCraft II is a challenging benchmark for AI agents due to the necessity of both precise micro level operations and strategic macro awareness.

Language Modelling, Large Language Model, +2

AI-Based Energy Transportation Safety: Pipeline Radial Threat Estimation Using Intelligent Sensing System

no code implementations • 18 Dec 2023 • Chengyuan Zhu, Yiyuan Yang, Kaixiang Yang, Haifeng Zhang, Qinmin Yang, C. L. Philip Chen

This refinement is crucial in effectively identifying genuine threats to pipelines, thus enhancing the safety of energy transportation.

Transfer Learning

Ask more, know better: Reinforce-Learned Prompt Questions for Decision Making with Large Language Models

no code implementations • 27 Oct 2023 • Xue Yan, Yan Song, Xinyu Cui, Filippos Christianos, Haifeng Zhang, David Henry Mguni, Jun Wang

To that purpose, we offer a new leader-follower bilevel framework that is capable of learning to ask relevant questions (prompts) and subsequently undertaking reasoning to guide the learning of actions.

Decision Making

Large Sequence Models for Sequential Decision-Making: A Survey

no code implementations • 24 Jun 2023 • Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang

Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in natural language processing and computer vision, e.g., GPT-3 and Swin Transformer.

Decision Making

An Empirical Study on Google Research Football Multi-agent Scenarios

1 code implementation • 16 May 2023 • Yan Song, He Jiang, Zheng Tian, Haifeng Zhang, Yingping Zhang, Jiangcheng Zhu, Zonghong Dai, Weinan Zhang, Jun Wang

Few multi-agent reinforcement learning (MARL) studies on Google Research Football (GRF) focus on the 11v11 multi-agent full-game scenario, and to the best of our knowledge, no open benchmark for this scenario has been released to the public.

Benchmarking, Multi-agent Reinforcement Learning, +1

Contextual Transformer for Offline Meta Reinforcement Learning

no code implementations • 15 Nov 2022 • Runji Lin, Ye Li, Xidong Feng, Zhaowei Zhang, Xian Hong Wu Fung, Haifeng Zhang, Jun Wang, Yali Du, Yaodong Yang

Firstly, we propose prompt tuning for offline RL, where a context vector sequence is concatenated with the input to guide the conditional policy generation.

D4RL, Meta Reinforcement Learning, +4

Learning to Identify Top Elo Ratings: A Dueling Bandits Approach

1 code implementation • 12 Jan 2022 • Xue Yan, Yali Du, Binxin Ru, Jun Wang, Haifeng Zhang, Xu Chen

The Elo rating system is widely adopted to evaluate the skills of (chess) game and sports players.

Scheduling

A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning

1 code implementation • 31 Dec 2021 • Xidong Feng, Bo Liu, Jie Ren, Luo Mai, Rui Zhu, Haifeng Zhang, Jun Wang, Yaodong Yang

Gradient-based Meta-RL (GMRL) refers to methods that maintain two-level optimisation procedures wherein the outer-loop meta-learner guides the inner-loop gradient-based reinforcement learner to achieve fast adaptations.

Atari Games, Meta Reinforcement Learning, +3

Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks

1 code implementation • 6 Dec 2021 • Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Bo Xu

In this paper, we facilitate the research by providing large-scale datasets, and use them to examine the usage of the Decision Transformer in the context of MARL.

Offline RL, reinforcement-learning, +4

A Game-Theoretic Approach for Improving Generalization Ability of TSP Solvers

no code implementations • 28 Oct 2021 • Chenguang Wang, Yaodong Yang, Oliver Slumbers, Congying Han, Tiande Guo, Haifeng Zhang, Jun Wang

In this paper, we introduce a two-player zero-sum framework between a trainable Solver and a Data Generator to improve the generalization ability of deep learning-based solvers for the Traveling Salesman Problem (TSP).

Traveling Salesman Problem

Offline Pre-trained Multi-Agent Decision Transformer

no code implementations • 29 Sep 2021 • Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Haifeng Zhang, Ying Wen, Weinan Zhang, Jun Wang, Bo Xu

Offline reinforcement learning leverages static datasets to learn optimal policies with no necessity to access the environment.

Multi-agent Reinforcement Learning, reinforcement-learning, +2

Settling the Variance of Multi-Agent Policy Gradients

1 code implementation • NeurIPS 2021 • Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang

In multi-agent RL (MARL), although the PG theorem can be naturally extended, the effectiveness of multi-agent PG (MAPG) methods degrades as the variance of gradient estimates increases rapidly with the number of agents.

Reinforcement Learning (RL), Starcraft

Learning Predictive Communication by Imagination in Networked System Control

no code implementations • 1 Jan 2021 • Yali Du, Yifan Zhao, Meng Fang, Jun Wang, Gangyan Xu, Haifeng Zhang

Dealing with multi-agent control in networked systems is one of the biggest challenges in Reinforcement Learning (RL), and limited success has been reported compared to recent deep reinforcement learning in the single-agent domain.

reinforcement-learning, Reinforcement Learning (RL)

Improving Knowledge Tracing via Pre-training Question Embeddings

1 code implementation • 9 Dec 2020 • Yunfei Liu, Yang Yang, Xianyu Chen, Jian Shen, Haifeng Zhang, Yong Yu

Knowledge tracing (KT) defines the task of predicting whether students can correctly answer questions based on their historical responses.

Ranked #3 on Knowledge Tracing on EdNet

Knowledge Tracing

Signal Instructed Coordination in Cooperative Multi-agent Reinforcement Learning

no code implementations • 10 Sep 2019 • Liheng Chen, Hongyi Guo, Yali Du, Fei Fang, Haifeng Zhang, Yaoming Zhu, Ming Zhou, Wei-Nan Zhang, Qing Wang, Yong Yu

Although existing works formulate this problem into a centralized learning with decentralized execution framework, which avoids the non-stationary problem in training, their decentralized execution paradigm limits the agents' capability to coordinate.

Multi-agent Reinforcement Learning, reinforcement-learning, +1

Bi-level Actor-Critic for Multi-agent Coordination

1 code implementation • 8 Sep 2019 • Haifeng Zhang, Weizhe Chen, Zeren Huang, Minne Li, Yaodong Yang, Wei-Nan Zhang, Jun Wang

Coordination is one of the essential problems in multi-agent systems.

Multiagent Systems

Layout Design for Intelligent Warehouse by Evolution with Fitness Approximation

no code implementations • 14 Nov 2018 • Haifeng Zhang, Zilong Guo, Han Cai, Chris Wang, Wei-Nan Zhang, Yong Yu, Wenxin Li, Jun Wang

With the rapid growth of the express industry, intelligent warehouses that employ autonomous robots for carrying parcels have been widely used to handle the vast express volume.

Layout Design

ICFVR 2017: 3rd International Competition on Finger Vein Recognition

no code implementations • 4 Jan 2018 • Yi Zhang, Houjun Huang, Haifeng Zhang, Liao Ni, Wei Xu, Nasir Uddin Ahmed, Md. Shakil Ahmed, Yilun Jin, Yingjie Chen, Jingxuan Wen, Wenxin Li

The development of finger vein recognition algorithms heavily depends on large-scale real-world data sets.

Finger Vein Recognition

Learning to Design Games: Strategic Environments in Reinforcement Learning

no code implementations • 5 Jul 2017 • Haifeng Zhang, Jun Wang, Zhiming Zhou, Wei-Nan Zhang, Ying Wen, Yong Yu, Wenxin Li

In typical reinforcement learning (RL), the environment is assumed given and the goal of the learning is to identify an optimal policy for the agent taking actions through its interactions with the environment.

reinforcement-learning, Reinforcement Learning (RL)

Empirically Grounded Agent-Based Models of Innovation Diffusion: A Critical Review

no code implementations • 30 Aug 2016 • Haifeng Zhang, Yevgeniy Vorobeychik

Innovation diffusion has been studied extensively in a variety of disciplines, including sociology, economics, marketing, ecology, and computer science.

Marketing, Sociology
