Search Results for author: Chengqian Gao

Found 2 papers, 1 paper with code

Robust Offline Reinforcement Learning with Gradient Penalty and Constraint Relaxation

1 code implementation • 19 Oct 2022 • Chengqian Gao, Ke Xu, Liu Liu, Deheng Ye, Peilin Zhao, Zhiqiang Xu

A promising paradigm for offline reinforcement learning (RL) is to constrain the learned policy to stay close to the dataset behaviors, known as policy constraint offline RL.

D4RL • Offline RL • +2

Value Penalized Q-Learning for Recommender Systems

no code implementations • 15 Oct 2021 • Chengqian Gao, Ke Xu, Kuangqi Zhou, Lanqing Li, Xueqian Wang, Bo Yuan, Peilin Zhao

To alleviate the action distribution shift problem in extracting RL policy from static trajectories, we propose Value Penalized Q-learning (VPQ), an uncertainty-based offline RL algorithm.

Offline RL • Q-Learning • +2
