Self-correcting Q-Learning

2 Dec 2020 · Rong Zhu, Mattia Rigotti

The Q-learning algorithm is known to be affected by the maximization bias, i.e. the systematic overestimation of action values, an important issue that has recently received renewed attention. Double Q-learning has been proposed as an efficient algorithm to mitigate this bias...
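
The maximization bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper and does not implement its self-correcting estimator. It only contrasts the plain max estimator (the one that appears in the Q-learning target) with the double estimator behind Double Q-learning. The function name `estimate_bias` and all parameter values are hypothetical, chosen for illustration, and every action is assumed to have a true value of 0, so any nonzero average is pure bias.

```python
import random


def estimate_bias(n_actions=10, n_samples=5, n_trials=5000, seed=0):
    """Average value reported by the single and double estimators.

    Every action has true value 0, so the true maximum is 0 and any
    nonzero average returned here is estimator bias.
    (Hypothetical helper for illustration only.)
    """
    rng = random.Random(seed)

    def noisy_estimates():
        # One sample-mean estimate per action, from unit-variance noise.
        return [
            sum(rng.gauss(0.0, 1.0) for _ in range(n_samples)) / n_samples
            for _ in range(n_actions)
        ]

    single_total = 0.0
    double_total = 0.0
    for _ in range(n_trials):
        est_a = noisy_estimates()  # first set of estimates
        est_b = noisy_estimates()  # independent second set

        # Single estimator: take the max of the noisy estimates directly.
        single_total += max(est_a)

        # Double estimator: select the action with set A,
        # then evaluate that action with the independent set B.
        best = est_a.index(max(est_a))
        double_total += est_b[best]

    return single_total / n_trials, double_total / n_trials


single_bias, double_bias = estimate_bias()
print(f"single estimator average: {single_bias:.3f}")
print(f"double estimator average: {double_bias:.3f}")
```

Under these assumptions the single estimator's average comes out clearly positive even though every true value is 0, while the double estimator's average stays close to 0. In general the double estimator can underestimate the true maximum when action values differ; that case is not exercised here because all true values are equal.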

Methods used in the Paper


METHOD               TYPE
Experience Replay    Replay Memory
Convolution          Convolutions
Double DQN           Q-Learning Networks
Dense Connections    Feedforward Networks
DQN                  Q-Learning Networks
Double Q-learning    Off-Policy TD Control
Q-Learning           Off-Policy TD Control