Learning from Noisy Demonstration Sets via Meta-Learned Suitability Assessor

ICLR 2019 · Te-Lin Wu, Jaedong Hwang, Jingyun Yang, Shaofan Lai, Carl Vondrick, Joseph J. Lim ·

A noisy and diverse demonstration set may hinder the performances of an agent aiming to acquire certain skills via imitation learning. However, state-of-the-art imitation learning algorithms often assume the optimality of the given demonstration set. In this paper, we address such optimal assumption by learning only from the most suitable demonstrations in a given set. Suitability of a demonstration is estimated by whether imitating it produce desirable outcomes for achieving the goals of the tasks. For more efficient demonstration suitability assessments, the learning agent should be capable of imitating a demonstration as quick as possible, which shares similar spirit with fast adaptation in the meta-learning regime. Our framework, thus built on top of Model-Agnostic Meta-Learning, evaluates how desirable the imitated outcomes are, after adaptation to each demonstration in the set. The resulting assessments hence enable us to select suitable demonstration subsets for acquiring better imitated skills. The videos related to our experiments are available at: https://sites.google.com/view/deepdj

PDF Abstract