no code implementations • 2 Mar 2023 • Archit Sharma, Ahmed M. Ahmed, Rehaan Ahmad, Chelsea Finn
In this work, we propose MEDAL++, a novel design for self-improving robotic systems: given a small set of expert demonstrations at the start, the robot autonomously practices the task by learning to both do and undo it, while simultaneously inferring the reward function from the demonstrations.
1 code implementation • 11 May 2022 • Archit Sharma, Rehaan Ahmad, Chelsea Finn
Prior works have considered an alternating approach in which a forward policy learns to solve the task and a backward policy learns to reset the environment. But to what initial state distribution should the backward policy reset the agent?
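The alternating forward/backward scheme described above can be illustrated with a minimal sketch. This is a toy under stated assumptions, not the authors' implementation: `ChainEnv`, `step_toward`, `run_reset_free`, and the fixed `reset_target` are hypothetical names introduced here, and both "policies" are hand-coded stand-ins for learned ones. The fixed `reset_target` simply marks the place where a choice of reset-state distribution would enter.

```python
from dataclasses import dataclass


@dataclass
class ChainEnv:
    """Toy 1-D chain with states 0..size-1 (hypothetical; never externally reset)."""
    size: int = 6
    state: int = 0

    def step(self, action: int) -> int:
        # Clip so the agent stays on the chain.
        self.state = max(0, min(self.size - 1, self.state + action))
        return self.state


def step_toward(state: int, target: int) -> int:
    """Hand-coded stand-in for a learned policy: move one step toward target."""
    if state < target:
        return 1
    if state > target:
        return -1
    return 0


def run_reset_free(env, goal, reset_target, total_steps, phase_len):
    """Alternate forward (solve) and backward (undo) phases with no env reset."""
    log = []
    forward = True
    steps_in_phase = 0
    for _ in range(total_steps):
        # The forward phase aims at the goal; the backward phase aims at
        # the assumed reset state.
        target = goal if forward else reset_target
        state = env.step(step_toward(env.state, target))
        log.append(("forward" if forward else "backward", state))
        steps_in_phase += 1
        # Hand control to the other policy once the target is reached
        # or the phase budget runs out.
        if state == target or steps_in_phase >= phase_len:
            forward = not forward
            steps_in_phase = 0
    return log


if __name__ == "__main__":
    env = ChainEnv(size=6)
    for phase, state in run_reset_free(env, goal=5, reset_target=1,
                                       total_steps=9, phase_len=10):
        print(phase, state)
```

In this sketch the backward phase always returns to one fixed state; replacing `reset_target` with a sample from a chosen distribution is where the question raised above would be answered.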