Search Results for author: Chengqian Gao

Found 3 papers, 2 papers with code

Hard-Thresholding Meets Evolution Strategies in Reinforcement Learning

1 code implementation · 2 May 2024 · Chengqian Gao, William de Vazelhes, Hualin Zhang, Bin Gu, Zhiqiang Xu

Evolution Strategies (ES) have emerged as a competitive alternative for model-free reinforcement learning, showcasing exemplary performance in tasks like MuJoCo and Atari.

Decision Making, reinforcement-learning

Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation

1 code implementation · 19 Oct 2022 · Chengqian Gao, Ke Xu, Liu Liu, Deheng Ye, Peilin Zhao, Zhiqiang Xu

A promising paradigm for offline reinforcement learning (RL) is to constrain the learned policy to stay close to the dataset behaviors, known as policy constraint offline RL.

D4RL, Offline RL, +2

Value Penalized Q-Learning for Recommender Systems

no code implementations · 15 Oct 2021 · Chengqian Gao, Ke Xu, Kuangqi Zhou, Lanqing Li, Xueqian Wang, Bo Yuan, Peilin Zhao

To alleviate the action distribution shift problem in extracting an RL policy from static trajectories, we propose Value Penalized Q-learning (VPQ), an uncertainty-based offline RL algorithm.

Offline RL, Q-Learning, +2
