1 code implementation • 21 Jan 2022 • Balazs Varga, Balazs Kulcsar, Morteza Haghir Chehreghani
The control input takes over the role of the target network and also exploits the temporal dependency of samples (as opposed to drawing them from a randomized memory buffer).