Search Results for author: Wanpeng Zhang

Found 10 papers, 3 papers with code

AdaRefiner: Refining Decisions of Language Models with Adaptive Feedback

1 code implementation • 29 Sep 2023 • Wanpeng Zhang, Zongqing Lu

Large Language Models (LLMs) have demonstrated significant success across various domains.

Common Sense Reasoning • Decision Making • +4

Tackling Non-Stationarity in Reinforcement Learning via Causal-Origin Representation

no code implementations • 5 Jun 2023 • Wanpeng Zhang, Yilin Li, Boyu Yang, Zongqing Lu

COREP primarily employs a guided updating mechanism to learn a stable graph representation for states, termed the causal-origin representation.

reinforcement-learning

Entity Divider with Language Grounding in Multi-Agent Reinforcement Learning

no code implementations • 25 Oct 2022 • Ziluo Ding, Wanpeng Zhang, Junpeng Yue, Xiangjun Wang, Tiejun Huang, Zongqing Lu

We investigate the use of natural language to drive the generalization of policies in multi-agent settings.

Multi-agent Reinforcement Learning • reinforcement-learning • +1

Robust Model-based Reinforcement Learning for Autonomous Greenhouse Control

no code implementations • 26 Aug 2021 • Wanpeng Zhang, Xiaoyan Cao, Yao Yao, Zhicheng An, Xi Xiao, Dijun Luo

In this paper, we present a model-based robust RL framework for autonomous greenhouse control to meet the sample efficiency and safety challenges.

Decision Making • Model-based Reinforcement Learning • +2

Model-Based Opponent Modeling

no code implementations • 4 Aug 2021 • Xiaopeng Yu, Jiechuan Jiang, Wanpeng Zhang, Haobin Jiang, Zongqing Lu

When one agent interacts with a multi-agent environment, it is challenging to deal with various opponents unseen before.

MBDP: A Model-based Approach to Achieve both Robustness and Sample Efficiency via Double Dropout Planning

no code implementations • 3 Aug 2021 • Wanpeng Zhang, Xi Xiao, Yao Yao, Mingzhe Chen, Dijun Luo

MBDP consists of two kinds of dropout mechanisms, where the rollout-dropout aims to improve the robustness with a small cost of sample efficiency, while the model-dropout is designed to compensate for the lost efficiency at a slight expense of robustness.

Model-based Reinforcement Learning

IGrow: A Smart Agriculture Solution to Autonomous Greenhouse Control

1 code implementation • 6 Jul 2021 • Xiaoyan Cao, Yao Yao, Lanqing Li, Wanpeng Zhang, Zhicheng An, Zhong Zhang, Li Xiao, Shihui Guo, Xiaoyu Cao, Meihong Wu, Dijun Luo

The optimal control of autonomous greenhouses is challenging: it requires decision-making based on high-dimensional sensory data, and the scaling of production is limited by the scarcity of labor capable of handling this task.

Cloud Computing • Decision Making