no code implementations • 17 Mar 2024 • Yudong Luo, Yangchen Pan, Han Wang, Philip Torr, Pascal Poupart
Reinforcement learning algorithms that use policy gradients (PG) to optimize Conditional Value at Risk (CVaR) suffer from sample inefficiency, which hinders their practical application.
2 code implementations • 20 Jun 2022 • Guiliang Liu, Yudong Luo, Ashish Gaurav, Kasra Rezaee, Pascal Poupart
When deploying Reinforcement Learning (RL) agents into a physical system, we must ensure that these agents are well aware of the underlying constraints.
no code implementations • ICLR 2022 • Yudong Luo, Guiliang Liu, Haonan Duan, Oliver Schulte, Pascal Poupart
Distributional Reinforcement Learning (RL) differs from traditional RL by estimating the distribution over returns to capture the intrinsic uncertainty of MDPs.
1 code implementation • 12 Sep 2021 • Ziyuan Ma, Yudong Luo, Jia Pan
Learning communication via deep reinforcement learning (RL) or imitation learning (IL) has recently been shown to be an effective way to solve Multi-Agent Path Finding (MAPF).
1 code implementation • 21 Jun 2021 • Ziyuan Ma, Yudong Luo, Hang Ma
The final trained policy is applied to each agent for decentralized execution.