11 Oct 2022 • Julius Wagenbach, Matthia Sabatelli
We study whether the learning rate $\alpha$, the discount factor $\gamma$, and the reward signal $r$ influence the overestimation bias of the Q-Learning algorithm.
Q-Learning
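
For orientation, the sketch below shows where $\alpha$, $\gamma$ and $r$ enter the standard tabular Q-Learning update, and why the `max` in the update target tends to overestimate when action values are noisy. It is a minimal illustration, not the authors' experimental code. The function names, the dictionary-based table and the Gaussian noise model are assumptions made for this example.

```python
import random


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Apply one standard tabular Q-Learning step and return the new value.

    `q` is a hypothetical table mapping each state to a list of action values.
    """
    # Bootstrapped target: reward r plus the discounted greedy next-state value.
    target = reward + gamma * max(q[next_state])
    # Move the current estimate towards the target by a step of size alpha.
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def mean_max_of_noisy_estimates(n_actions=10, noise_std=1.0, trials=10_000, seed=0):
    """Average the max over noisy estimates of actions whose true value is 0.

    Assumed noise model: each estimate is an independent zero-mean Gaussian.
    The true maximum is 0, so a positive average shows the upward bias that
    taking a max over noisy estimates introduces.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, noise_std) for _ in range(n_actions))
    return total / trials


if __name__ == "__main__":
    q = {0: [0.0, 0.0], 1: [1.0, 2.0]}
    print(q_update(q, state=0, action=1, reward=1.0, next_state=1, alpha=0.5, gamma=0.9))
    print(mean_max_of_noisy_estimates())
```

In this sketch the reward and the discount factor shape the target, while the learning rate controls how much of the (possibly overestimated) target is absorbed into the table at each step.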