1 code implementation • 16 Dec 2023 • Ruining Zhang, Haoran Han, Maolong Lv, Qisong Yang, Jian Cheng
Extensive utilization of deep reinforcement learning (DRL) policy networks in diverse continuous control tasks has raised questions regarding performance degradation in expansive state spaces where the input state norm is larger than that in the training environment.
no code implementations • 26 Jul 2023 • Qisong Yang, Thiago D. Simão, Nils Jansen, Simon H. Tindemans, Matthijs T. J. Spaan
Drawing from transfer learning, we also regularize a target policy (the student) towards the guide while the student is unreliable and gradually eliminate the influence of the guide as training progresses.