Variational Model-Based Imitation Learning in High-Dimensional Observation Spaces

We consider the problem setting of imitation learning where the agent is provided with a fixed dataset of demonstrations. While the agent can interact with the environment for exploration, it is oblivious to the reward function used by the demonstrator. This setting is representative of many applications in robotics where providing task demonstrations may be straightforward, while reward shaping or conveying stylistic aspects of human motion may be difficult. For this setting, we develop a variational model-based imitation learning algorithm (VMIL) that is capable of learning policies from visual observations. Through experiments, we find that VMIL is more sample-efficient than prior algorithms on several challenging vision-based locomotion and manipulation tasks, including a high-dimensional in-hand dexterous manipulation task.
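The abstract does not spell out the VMIL objective, so the sketch below only illustrates the general shape of variational model-based imitation from pixels: a VAE-style latent observation model with a learned latent dynamics head, and a policy trained to imitate the demonstrations in latent space. The class and function names (LatentModel, imitation_step), the architecture sizes, and the choice of latent-space behavior cloning as the imitation loss are all assumptions for illustration, not the authors' method.

```python
# Illustrative sketch (not the authors' exact method): a variational latent-state
# model over image observations plus a policy imitating demonstrations in latent
# space. All architecture and loss choices below are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentModel(nn.Module):
    """VAE-style observation model with a simple latent transition model."""
    def __init__(self, obs_dim=64 * 64 * 3, act_dim=8, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(),
                                     nn.Linear(256, 2 * latent_dim))  # mean, log-var
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, obs_dim))
        self.dynamics = nn.Sequential(nn.Linear(latent_dim + act_dim, 256), nn.ReLU(),
                                      nn.Linear(256, latent_dim))

    def encode(self, obs):
        mean, log_var = self.encoder(obs).chunk(2, dim=-1)
        z = mean + torch.randn_like(mean) * (0.5 * log_var).exp()  # reparameterization
        return z, mean, log_var

    def loss(self, obs, act, next_obs):
        z, mean, log_var = self.encode(obs)
        recon = F.mse_loss(self.decoder(z), obs)                    # reconstruction term
        kl = -0.5 * (1 + log_var - mean.pow(2) - log_var.exp()).mean()  # KL to N(0, I)
        z_next, _, _ = self.encode(next_obs)
        dyn = F.mse_loss(self.dynamics(torch.cat([z, act], -1)), z_next.detach())
        return recon + kl + dyn

def imitation_step(model, policy, policy_opt, demo_obs, demo_act):
    """One imitation update: behavior cloning on demonstrations in latent space
    (one plausible choice; the paper's actual imitation objective may differ)."""
    with torch.no_grad():
        z, _, _ = model.encode(demo_obs)
    loss = F.mse_loss(policy(z), demo_act)
    policy_opt.zero_grad()
    loss.backward()
    policy_opt.step()
    return loss.item()
```

In a full training loop one would alternate between fitting LatentModel on environment interaction data (its loss) and running imitation_step on the fixed demonstration batches, with the policy acting through the learned latent space rather than raw pixels.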
