no code implementations • 30 Jan 2024 • Ryoma Furuyama, Daiki Kuyoshi, Satoshi Yamane
To make this algorithm more robust to distribution shift, we propose a more efficient and robust algorithm that adds to this method a reward function based on adversarial inverse reinforcement learning, which rewards the agent for taking actions in states similar to those in the demonstrations.
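
As a rough illustration of what a reward based on adversarial inverse reinforcement learning (AIRL) can look like, the sketch below uses the standard AIRL discriminator form D(s, a) = exp(f(s, a)) / (exp(f(s, a)) + pi(a|s)), whose log-odds simplify to f(s, a) - log pi(a|s). This is not the authors' implementation: the function names, the scalar inputs `f_value` and `log_pi`, and the additive `weight` used to combine the bonus with a base reward are all hypothetical, since the abstract does not say how the two are combined.

```python
import math


def airl_discriminator(f_value: float, log_pi: float) -> float:
    """Standard AIRL discriminator D = exp(f) / (exp(f) + pi).

    Computed as sigmoid(f - log_pi) for numerical stability.
    f_value: output of a learned function f(s, a) (assumed given).
    log_pi:  log-probability of the action under the current policy.
    """
    z = f_value - log_pi
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def airl_reward(f_value: float, log_pi: float) -> float:
    """AIRL reward log D - log(1 - D), which simplifies to f - log_pi."""
    return f_value - log_pi


def combined_reward(base_reward: float, f_value: float, log_pi: float,
                    weight: float = 0.5) -> float:
    """Hypothetical combination: base reward plus a weighted AIRL bonus.

    The additive form and the default weight are assumptions made for
    illustration only.
    """
    return base_reward + weight * airl_reward(f_value, log_pi)


if __name__ == "__main__":
    # A demonstration-like pair gets a high f value, so its bonus is larger.
    print(combined_reward(1.0, f_value=2.0, log_pi=math.log(0.5)))
    print(combined_reward(1.0, f_value=-2.0, log_pi=math.log(0.5)))
```

In this sketch, state-action pairs that the discriminator judges to be demonstration-like (large f) receive a larger bonus, which matches the intent described above of rewarding actions taken in states similar to the demonstrations.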