1 code implementation • 13 Nov 2022 • Yiwen Qiu, Jialong Wu, Zhangjie Cao, Mingsheng Long
Existing imitation learning methods mainly assume that the demonstrator who collects the demonstrations shares the same dynamics as the imitator.
1 code implementation • 27 Jun 2022 • Haoyi Niu, Shubham Sharma, Yiwen Qiu, Ming Li, Guyue Zhou, Jianming Hu, Xianyuan Zhan
This brings up a new question: is it possible to combine learning from limited real data in offline RL and unrestricted exploration through imperfect simulators in online RL to address the drawbacks of both approaches?