Variational Model-Based Imitation Learning in High-Dimensional Observation Spaces

We consider the problem setting of imitation learning where the agent is provided with a fixed dataset of demonstrations. While the agent can interact with the environment for exploration, it is oblivious to the reward function used by the demonstrator. This setting is representative of many applications in robotics where providing task demonstrations may be straightforward, while reward shaping or conveying stylistic aspects of human motion may be difficult. For this setting, we develop a variational model-based imitation learning algorithm (VMIL) that is capable of learning policies from visual observations. Through experiments, we find that VMIL is more sample-efficient than prior algorithms on several challenging vision-based locomotion and manipulation tasks, including a high-dimensional in-hand dexterous manipulation task.
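The abstract does not spell out the VMIL objective, so the sketch below only illustrates the general shape of variational model-based imitation from pixels: a VAE-style latent observation model with a learned latent dynamics head, and a policy trained to imitate the demonstrations in latent space. The class and function names (LatentModel, imitation_step), the architecture sizes, and the choice of latent-space behavior cloning as the imitation loss are all assumptions for illustration, not the authors' method.

```python
# Illustrative sketch (not the authors' exact method): a variational latent-state
# model over image observations plus a policy imitating demonstrations in latent
# space. All architecture and loss choices below are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentModel(nn.Module):
    """VAE-style observation model with a simple latent transition model."""
    def __init__(self, obs_dim=64 * 64 * 3, act_dim=8, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(),
                                     nn.Linear(256, 2 * latent_dim))  # mean, log-var
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, obs_dim))
        self.dynamics = nn.Sequential(nn.Linear(latent_dim + act_dim, 256), nn.ReLU(),
                                      nn.Linear(256, latent_dim))

    def encode(self, obs):
        mean, log_var = self.encoder(obs).chunk(2, dim=-1)
        z = mean + torch.randn_like(mean) * (0.5 * log_var).exp()  # reparameterization
        return z, mean, log_var

    def loss(self, obs, act, next_obs):
        z, mean, log_var = self.encode(obs)
        recon = F.mse_loss(self.decoder(z), obs)                    # reconstruction term
        kl = -0.5 * (1 + log_var - mean.pow(2) - log_var.exp()).mean()  # KL to N(0, I)
        z_next, _, _ = self.encode(next_obs)
        dyn = F.mse_loss(self.dynamics(torch.cat([z, act], -1)), z_next.detach())
        return recon + kl + dyn

def imitation_step(model, policy, policy_opt, demo_obs, demo_act):
    """One imitation update: behavior cloning on demonstrations in latent space
    (one plausible choice; the paper's actual imitation objective may differ)."""
    with torch.no_grad():
        z, _, _ = model.encode(demo_obs)
    loss = F.mse_loss(policy(z), demo_act)
    policy_opt.zero_grad()
    loss.backward()
    policy_opt.step()
    return loss.item()
```

In a full training loop one would alternate between fitting LatentModel on environment interaction data (its loss) and running imitation_step on the fixed demonstration batches, with the policy acting through the learned latent space rather than raw pixels.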
