Search Results for author: Simon Sinong Zhan

Found 4 papers, 2 papers with code

Variational Delayed Policy Optimization

no code implementations • 23 May 2024 • Qingyuan Wu, Simon Sinong Zhan, YiXuan Wang, Yuhui Wang, Chung-Wei Lin, Chen Lv, Qi Zhu, Chao Huang

In environments with delayed observations, the state is augmented with the actions taken within the delay window to recover the Markov property and enable reinforcement learning (RL).

Reinforcement Learning (RL) Variational Inference

Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays

1 code implementation • 5 Feb 2024 • Qingyuan Wu, Simon Sinong Zhan, YiXuan Wang, Yuhui Wang, Chung-Wei Lin, Chen Lv, Qi Zhu, Jürgen Schmidhuber, Chao Huang

To address these challenges, we present a novel Auxiliary-Delayed Reinforcement Learning (AD-RL) method that leverages auxiliary tasks involving short delays to accelerate RL with long delays, without compromising performance in stochastic environments.

reinforcement-learning

State-Wise Safe Reinforcement Learning With Pixel Observations

1 code implementation • 3 Nov 2023 • Simon Sinong Zhan, YiXuan Wang, Qingyuan Wu, Ruochen Jiao, Chao Huang, Qi Zhu

In the context of safe exploration, Reinforcement Learning (RL) has long grappled with the challenges of balancing the tradeoff between maximizing rewards and minimizing safety violations, particularly in complex environments with contact-rich or non-smooth dynamics, and when dealing with high-dimensional pixel observations.

reinforcement-learning Reinforcement Learning (RL)

Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments

no code implementations • 29 Sep 2022 • YiXuan Wang, Simon Sinong Zhan, Ruochen Jiao, Zhilu Wang, Wanxin Jin, Zhuoran Yang, Zhaoran Wang, Chao Huang, Qi Zhu

It is quite challenging to ensure the safety of reinforcement learning (RL) agents in an unknown and stochastic environment under hard constraints that require the system state not to reach certain specified unsafe regions.

Reinforcement Learning (RL) Safe Reinforcement Learning
