Search Results for author: Xingzhou Lou

Found 4 papers, 2 papers with code

SPO: Multi-Dimensional Preference Sequential Alignment With Implicit Reward Modeling

no code implementations • 21 May 2024 • Xingzhou Lou, Junge Zhang, Jian Xie, Lifeng Liu, Dong Yan, Kaiqi Huang

Human preference alignment is critical in building powerful and reliable large language models (LLMs).

Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models

no code implementations • 15 Jan 2024 • Xingzhou Lou, Junge Zhang, Ziyan Wang, Kaiqi Huang, Yali Du

By using pre-trained LMs and eliminating the need for a ground-truth cost, our method enhances safe policy learning under a diverse set of human-derived, free-form natural language constraints.

Tasks: Reinforcement Learning (RL) · Safe Reinforcement Learning
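The abstract above describes replacing a ground-truth cost with judgments from a pre-trained language model. The snippet below is a minimal, hypothetical sketch of that general idea, not the paper's actual method: it assumes a zero-shot NLI classifier (here, facebook/bart-large-mnli via the Hugging Face transformers pipeline) can judge whether a textual description of an agent's transition violates a free-form constraint, and treats that judgment as a cost signal.

```python
# Hypothetical sketch (not the paper's algorithm): use a pre-trained LM to
# decide whether a described transition violates a free-form constraint,
# and use that decision in place of a ground-truth cost.
from transformers import pipeline

# Zero-shot NLI classifier; the model choice is an illustrative assumption.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

CONSTRAINT = "The robot must never enter the water."  # free-form constraint

def lm_cost(transition_description: str) -> float:
    """Return 1.0 if the LM judges the transition to violate the constraint."""
    result = classifier(
        transition_description,
        candidate_labels=["violates the constraint", "satisfies the constraint"],
        hypothesis_template=f"Given the rule '{CONSTRAINT}', this behavior {{}}.",
    )
    # The pipeline returns labels sorted by score; take the top label.
    return 1.0 if result["labels"][0] == "violates the constraint" else 0.0

print(lm_cost("The robot walked across the bridge."))
print(lm_cost("The robot stepped into the lake."))
```

In a constrained RL loop, such a cost could enter the update as, e.g., a Lagrangian penalty on the return; how the paper actually integrates the LM signal is not specified in this snippet's scope.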

TAPE: Leveraging Agent Topology for Cooperative Multi-Agent Policy Gradient

1 code implementation • 25 Dec 2023 • Xingzhou Lou, Junge Zhang, Timothy J. Norman, Kaiqi Huang, Yali Du

We propose Topology-based multi-Agent Policy gradiEnt (TAPE) for both stochastic and deterministic MAPG methods.

PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination

1 code implementation • 16 Jan 2023 • Xingzhou Lou, Jiaxian Guo, Junge Zhang, Jun Wang, Kaiqi Huang, Yali Du

We conduct experiments in the Overcooked environment and evaluate the zero-shot human-AI coordination performance of our method with both behavior-cloned human proxies and real humans.
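The evaluation protocol in the sentence above, pairing the trained agent with partners it never trained with, can be summarized in a short sketch. Everything here (the environment API, the policies, and the ToyCoordEnv stand-in) is an illustrative assumption, not the PECAN code:

```python
# Hypothetical sketch of zero-shot coordination evaluation: pair the trained
# agent with each held-out partner policy and average episodic returns.
import random

class ToyCoordEnv:
    """Trivial stand-in for a two-player environment such as Overcooked."""
    def reset(self):
        self.t = 0
        return 0, 0  # (obs_agent, obs_partner)

    def step(self, actions):
        self.t += 1
        reward = 1.0 if actions[0] == actions[1] else 0.0  # reward agreement
        done = self.t >= 20
        return (self.t, self.t), reward, done

def play_episode(env, agent_policy, partner_policy, max_steps=400):
    """Roll out one two-player episode and return the shared reward sum."""
    obs_agent, obs_partner = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        a = agent_policy(obs_agent)
        b = partner_policy(obs_partner)
        (obs_agent, obs_partner), reward, done = env.step((a, b))
        total_reward += reward
        if done:
            break
    return total_reward

def zero_shot_eval(env, agent_policy, partner_pool, episodes_per_partner=10):
    """Average return of agent_policy across unseen partner policies."""
    scores = [
        play_episode(env, agent_policy, partner)
        for partner in partner_pool
        for _ in range(episodes_per_partner)
    ]
    return sum(scores) / len(scores)

agent = lambda obs: 0
partners = [lambda obs: 0, lambda obs: random.choice([0, 1])]
print(zero_shot_eval(ToyCoordEnv(), agent, partners))
```

In the paper's setting, the partner pool would contain behavior-cloned human proxies rather than toy policies; real-human evaluation cannot be scripted this way.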
