Is Q-Learning Provably Efficient? An Extended Analysis

22 Sep 2020 · Kushagra Rastogi, Jonathan Lee, Fabrice Harel-Canada, Aditya Joglekar

This work extends the analysis of the theoretical results presented in the paper "Is Q-Learning Provably Efficient?" by Jin et al. We include a survey of related research to contextualize the need for stronger theoretical guarantees in what is perhaps one of the most important threads of model-free reinforcement learning. We also expound upon the reasoning used in the proofs to highlight the critical steps leading to the main result: Q-learning with UCB exploration achieves a regret that matches, up to a single √H factor, the optimal regret achievable by any model-based approach.
