Search Results for author: Huaqing Xiong

Found 8 papers, 0 papers with code

Faster Non-asymptotic Convergence for Double Q-learning

no code implementations NeurIPS 2021 Lin Zhao, Huaqing Xiong, Yingbin Liang

This paper tackles the more challenging case of a constant learning rate, and develops new analytical tools that improve the existing convergence rate by orders of magnitude.

Q-Learning

Double Q-learning: New Analysis and Sharper Finite-time Bound

no code implementations 1 Jan 2021 Lin Zhao, Huaqing Xiong, Yingbin Liang, Wei Zhang

Double Q-learning (Hasselt 2010) has gained significant success in practice due to its effectiveness in overcoming the overestimation issue of Q-learning.

Q-Learning

Finite-Time Analysis for Double Q-learning

no code implementations NeurIPS 2020 Huaqing Xiong, Lin Zhao, Yingbin Liang, Wei Zhang

Although Q-learning is one of the most successful algorithms for finding the best action-value function (and thus the optimal policy) in reinforcement learning, its implementation often suffers from large overestimation of Q-function values incurred by random sampling.

Q-Learning

Momentum Q-learning with Finite-Sample Convergence Guarantee

no code implementations 30 Jul 2020 Bowen Weng, Huaqing Xiong, Lin Zhao, Yingbin Liang, Wei Zhang

For the infinite state-action space case, we establish the convergence guarantee for MomentumQ with linear function approximations and Markovian sampling.

Q-Learning

Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent

no code implementations 15 Jul 2020 Bowen Weng, Huaqing Xiong, Yingbin Liang, Wei Zhang

In this paper, we first characterize the convergence rate for Q-AMSGrad, which is the Q-learning algorithm with the AMSGrad update (a commonly adopted alternative to Adam for theoretical analysis).

Atari Games Q-Learning

Non-asymptotic Convergence of Adam-type Reinforcement Learning Algorithms under Markovian Sampling

no code implementations 15 Feb 2020 Huaqing Xiong, Tengyu Xu, Yingbin Liang, Wei Zhang

Despite the wide applications of Adam in reinforcement learning (RL), the theoretical convergence of Adam-type RL algorithms has not been established.

reinforcement-learning Reinforcement Learning (RL)

Can AltQ Learn Faster: Experiments and Theory

no code implementations 25 Sep 2019 Bowen Weng, Huaqing Xiong, Yingbin Liang, Wei Zhang

Differently from the popular Deep Q-Network (DQN) learning, Alternating Q-learning (AltQ) does not fully fit a target Q-function at each iteration, and is generally known to be unstable and inefficient.

Atari Games Q-Learning
