no code implementations • 30 Jul 2020 • Bowen Weng, Huaqing Xiong, Lin Zhao, Yingbin Liang, Wei Zhang
For the infinite state-action space case, we establish a convergence guarantee for MomentumQ with linear function approximation and Markovian sampling.
no code implementations • 15 Jul 2020 • Bowen Weng, Huaqing Xiong, Yingbin Liang, Wei Zhang
In this paper, we first characterize the convergence rate for Q-AMSGrad, which is the Q-learning algorithm with the AMSGrad update (a commonly adopted alternative to Adam for theoretical analysis).
no code implementations • ICML 2020 • Kaiyi Ji, Zhe Wang, Bowen Weng, Yi Zhou, Wei Zhang, Yingbin Liang
In this paper, we propose a novel scheme that eliminates backtracking line search but still exploits information along the optimization path by adapting the batch size based on the history of stochastic gradients.
no code implementations • 3 Oct 2019 • Guillermo A. Castillo, Bowen Weng, Wei Zhang, Ayonga Hereid
This paper presents a novel model-free reinforcement learning (RL) framework to design feedback control policies for 3D bipedal walking.
no code implementations • 25 Sep 2019 • Bowen Weng, Huaqing Xiong, Yingbin Liang, Wei Zhang
Unlike the popular Deep Q-Network (DQN) learning, Alternating Q-learning (AltQ) does not fully fit a target Q-function at each iteration, and is generally known to be unstable and inefficient.
no code implementations • 7 May 2019 • Bowen Weng, Huaqing Xiong, Wei Zhang
This paper studies acceleration in Q-learning algorithms.