Search Results for author: Zongzhang Zhang

Found 24 papers, 11 papers with code

Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation

1 code implementation12 Mar 2024 Chengxing Jia, Fuxiang Zhang, Yi-Chen Li, Chen-Xiao Gao, Xu-Hui Liu, Lei Yuan, Zongzhang Zhang, Yang Yu

Specifically, the objective of adversarial data augmentation is not merely to generate data analogous to offline data distribution; instead, it aims to create adversarial examples designed to confound learned task representations and lead to incorrect task identification.

Contrastive Learning Data Augmentation +3

Reinforced In-Context Black-Box Optimization

1 code implementation27 Feb 2024 Lei Song, Chenxiao Gao, Ke Xue, Chenyang Wu, Dong Li, Jianye Hao, Zongzhang Zhang, Chao Qian

In this paper, we propose RIBBO, a method to reinforce-learn a BBO algorithm from offline data in an end-to-end fashion.

In-Context Learning Meta-Learning

Debiased Offline Representation Learning for Fast Online Adaptation in Non-stationary Dynamics

no code implementations17 Feb 2024 Xinyu Zhang, Wenjie Qiu, Yi-Chen Li, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu

DORA incorporates an information bottleneck principle that maximizes mutual information between the dynamics encoding and the environmental data, while minimizing mutual information between the dynamics encoding and the actions of the behavior policy.

Representation Learning

ACT: Empowering Decision Transformer with Dynamic Programming via Advantage Conditioning

1 code implementation12 Sep 2023 Chen-Xiao Gao, Chenyang Wu, Mingjun Cao, Rui Kong, Zongzhang Zhang, Yang Yu

Third, we train an Advantage-Conditioned Transformer (ACT) to generate actions conditioned on the estimated advantages.

Action Generation

Policy Regularization with Dataset Constraint for Offline Reinforcement Learning

2 code implementations11 Jun 2023 Yuhang Ran, Yi-Chen Li, Fuxiang Zhang, Zongzhang Zhang, Yang Yu

A common taxonomy of existing offline RL works is policy regularization, which typically constrains the learned policy by distribution or support of the behavior policy.

Offline RL reinforcement-learning +1

Language Model Self-improvement by Reinforcement Learning Contemplation

no code implementations23 May 2023 Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu

We demonstrate that SIRLC can be applied to various NLP tasks, such as reasoning problems, text generation, and machine translation.

Language Modelling Machine Translation +3

Robust Multi-agent Communication via Multi-view Message Certification

no code implementations7 May 2023 Lei Yuan, Tao Jiang, Lihe Li, Feng Chen, Zongzhang Zhang, Yang Yu

Many multi-agent scenarios require message sharing among agents to promote coordination, hastening the robustness of multi-agent communication when policies are deployed in a message perturbation environment.

How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement

1 code implementation3 Mar 2023 Xu-Hui Liu, Feng Xu, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Ruifeng Chen, Zongzhang Zhang, Yang Yu

In this paper, we propose a novel active imitation learning framework based on a teacher-student interaction model, in which the teacher's goal is to identify the best teaching behavior and actively affect the student's learning process.

Atari Games Imitation Learning

Retrosynthetic Planning with Dual Value Networks

1 code implementation31 Jan 2023 Guoqing Liu, Di Xue, Shufang Xie, Yingce Xia, Austin Tripp, Krzysztof Maziarz, Marwin Segler, Tao Qin, Zongzhang Zhang, Tie-Yan Liu

Retrosynthesis, which aims to find a route to synthesize a target molecule from commercially available starting materials, is a critical task in drug discovery and materials design.

Drug Discovery Multi-step retrosynthesis +2

Multi-agent Dynamic Algorithm Configuration

1 code implementation13 Oct 2022 Ke Xue, Jiacheng Xu, Lei Yuan, Miqing Li, Chao Qian, Zongzhang Zhang, Yang Yu

MA-DAC formulates the dynamic configuration of a complex algorithm with multiple types of hyperparameters as a contextual multi-agent Markov decision process and solves it by a cooperative multi-agent RL (MARL) algorithm.

Multi-Armed Bandits Reinforcement Learning (RL)

Deep Anomaly Detection and Search via Reinforcement Learning

no code implementations31 Aug 2022 Chao Chen, Dawei Wang, Feng Mao, Zongzhang Zhang, Yang Yu

Semi-supervised Anomaly Detection (AD) is a kind of data mining task which aims at learning features from partially-labeled datasets to help detect outliers.

Ensemble Learning Partially Labeled Datasets +4

Model Generation with Provable Coverability for Offline Reinforcement Learning

no code implementations1 Jun 2022 Chengxing Jia, Hao Yin, Chenxiao Gao, Tian Xu, Lei Yuan, Zongzhang Zhang, Yang Yu

Model-based offline optimization with dynamics-aware policy provides a new perspective for policy learning and out-of-distribution generalization, where the learned policy could adapt to different dynamics enumerated at the training stage.

Offline RL Out-of-Distribution Generalization +2

Multi-Agent Policy Transfer via Task Relationship Modeling

no code implementations9 Mar 2022 Rongjun Qin, Feng Chen, Tonghan Wang, Lei Yuan, Xiaoran Wu, Zongzhang Zhang, Chongjie Zhang, Yang Yu

We demonstrate that the task representation can capture the relationship among tasks, and can generalize to unseen tasks.

Transfer Learning

Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning

1 code implementation NeurIPS 2021 Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, Yang Yu

Experiments on MuJoCo and Hand Manipulation Suite tasks show that the agents deployed with our method achieve similar performance as it has in the source domain, while those deployed with previous methods designed for same-modal domain adaptation suffer a larger performance gap.

Domain Adaptation reinforcement-learning +1

Triple-GAIL: A Multi-Modal Imitation Learning Framework with Generative Adversarial Nets

no code implementations19 May 2020 Cong Fei, Bin Wang, Yuzheng Zhuang, Zongzhang Zhang, Jianye Hao, Hongbo Zhang, Xuewu Ji, Wulong Liu

Generative adversarial imitation learning (GAIL) has shown promising results by taking advantage of generative adversarial nets, especially in the field of robot learning.

Autonomous Vehicles Data Augmentation +1

Efficient Deep Reinforcement Learning via Adaptive Policy Transfer

1 code implementation19 Feb 2020 Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Yujing Hu, Yingfeng Cheng, Changjie Fan, Weixun Wang, Wulong Liu, Zhaodong Wang, Jiajie Peng

Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks.

reinforcement-learning Reinforcement Learning (RL) +1

Monte-Carlo Tree Search for Policy Optimization

no code implementations23 Dec 2019 Xiaobai Ma, Katherine Driggs-Campbell, Zongzhang Zhang, Mykel J. Kochenderfer

Gradient-based methods are often used for policy optimization in deep reinforcement learning, despite being vulnerable to local optima and saddle points.

reinforcement-learning Reinforcement Learning (RL)

A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents

no code implementations NeurIPS 2018 Yan Zheng, Zhaopeng Meng, Jianye Hao, Zongzhang Zhang, Tianpei Yang, Changjie Fan

In multiagent domains, coping with non-stationary agents that change behaviors from time to time is a challenging problem, where an agent is usually required to be able to quickly detect the other agent's policy during online interaction, and then adapt its own policy accordingly.

Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction

no code implementations25 Sep 2018 Hongyao Tang, Jianye Hao, Tangjie Lv, Yingfeng Chen, Zongzhang Zhang, Hangtian Jia, Chunxu Ren, Yan Zheng, Zhaopeng Meng, Changjie Fan, Li Wang

Besides, we propose a new experience replay mechanism to alleviate the issue of the sparse transitions at the high level of abstraction and the non-stationarity of multiagent learning.

reinforcement-learning Reinforcement Learning (RL)

Cannot find the paper you are looking for? You can Submit a new open access paper.