Search Results for author: Shangding Gu

Found 8 papers, 3 papers with code

TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning

no code implementations 13 Mar 2024 Shangding Gu, Alois Knoll, Ming Jin

The development of Large Language Models (LLMs) often confronts challenges stemming from the heavy reliance on human annotators in the reinforcement learning with human feedback (RLHF) framework, or the frequent and costly external queries tied to the self-instruct paradigm.

reinforcement-learning Reinforcement Learning (RL)

Mutual Enhancement of Large Language and Reinforcement Learning Models through Bi-Directional Feedback Mechanisms: A Case Study

no code implementations 12 Jan 2024 Shangding Gu

In this study, we employ a teacher-student learning framework to tackle these problems, specifically by offering feedback for LLMs using RL models and providing high-level information for RL models with LLMs in a cooperative multi-agent setting.

Efficient Exploration Reinforcement Learning (RL)

Spreeze: High-Throughput Parallel Reinforcement Learning Framework

no code implementations 11 Dec 2023 Jing Hou, Guang Chen, Ruiqi Zhang, Zhijun Li, Shangding Gu, Changjun Jiang

While existing parallel RL frameworks encompass a variety of RL algorithms and parallelization techniques, the excessively burdensome communication frameworks hinder the attainment of the hardware's limit for final throughput and training effects on a single desktop.

reinforcement-learning Reinforcement Learning (RL)

SCPO: Safe Reinforcement Learning with Safety Critic Policy Optimization

no code implementations 1 Nov 2023 Jaafar Mhamed, Shangding Gu

In this study, we define the safety critic, a mechanism that nullifies rewards obtained through violating safety constraints.

Benchmarking reinforcement-learning +1
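
The reward-nullification idea in the SCPO summary above can be illustrated with a minimal sketch. It assumes a per-step scalar cost signal and a fixed threshold; the function name `nullify_unsafe_reward`, the `cost_threshold` parameter, and the sample values are hypothetical and do not come from the paper, whose safety critic may be defined differently (for example, as a learned estimator).

```python
from typing import List


def nullify_unsafe_reward(reward: float, cost: float, cost_threshold: float = 0.0) -> float:
    """Return 0.0 when the step's cost exceeds the threshold, else the raw reward.

    Hypothetical helper: a hard-threshold stand-in for a safety critic.
    """
    return 0.0 if cost > cost_threshold else reward


# Illustrative trajectory: the second step violates the (assumed) constraint.
rewards: List[float] = [1.0, 2.0, 0.5]
costs: List[float] = [0.0, 1.0, 0.0]

shaped = [nullify_unsafe_reward(r, c) for r, c in zip(rewards, costs)]
print(shaped)  # the reward earned on the violating step is zeroed
```

Under these assumptions, an agent gains nothing from rewards collected while violating the constraint, which is the effect the summary describes.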

A Human-Centered Safe Robot Reinforcement Learning Framework with Interactive Behaviors

no code implementations 25 Feb 2023 Shangding Gu, Alap Kshirsagar, Yali Du, Guang Chen, Jan Peters, Alois Knoll

Deployment of Reinforcement Learning (RL) algorithms for robotics applications in the real world requires ensuring the safety of the robot and its environment.

reinforcement-learning Reinforcement Learning (RL) +1

A Review of Safe Reinforcement Learning: Methods, Theory and Applications

1 code implementation 20 May 2022 Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Yaodong Yang, Alois Knoll

To establish a good foundation for future research in this thread, in this paper, we provide a review for safe RL from the perspectives of methods, theory and applications.

Autonomous Driving Decision Making +3

Multi-Agent Constrained Policy Optimisation

3 code implementations 6 Oct 2021 Shangding Gu, Jakub Grudzien Kuba, Munning Wen, Ruiqing Chen, Ziyan Wang, Zheng Tian, Jun Wang, Alois Knoll, Yaodong Yang

To fill these gaps, in this work, we formulate the safe MARL problem as a constrained Markov game and solve it with policy optimisation methods.

Multi-agent Reinforcement Learning reinforcement-learning +1

Settling the Variance of Multi-Agent Policy Gradients

1 code implementation NeurIPS 2021 Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang

In multi-agent RL (MARL), although the PG theorem can be naturally extended, the effectiveness of multi-agent PG (MAPG) methods degrades as the variance of gradient estimates increases rapidly with the number of agents.

Reinforcement Learning (RL) Starcraft
