Imitation Learning

521 papers with code • 0 benchmarks • 18 datasets

Imitation Learning is a framework for learning a behavior policy from demonstrations. Demonstrations are usually presented as state-action trajectories, with each pair indicating the action to take at the state being visited. To learn the behavior policy, the demonstrated actions are typically used in one of two ways. The first, known as Behavior Cloning (BC), treats each action as the target label for its state and learns a generalized mapping from states to actions in a supervised manner. The second, known as Inverse Reinforcement Learning (IRL), views the demonstrated actions as a sequence of decisions and aims to find a reward/cost function under which the demonstrated decisions are optimal.
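To make the Behavior Cloning idea concrete, here is a minimal sketch in PyTorch; the network, dimensions, and placeholder demonstration tensors are illustrative assumptions, not taken from any of the papers listed below.

```python
import torch
import torch.nn as nn

# Behavior Cloning: treat each demonstrated (state, action) pair as a
# supervised (input, label) example and fit a state -> action mapping.
state_dim, num_actions = 8, 4                             # assumed sizes
policy = nn.Sequential(
    nn.Linear(state_dim, 64), nn.ReLU(),
    nn.Linear(64, num_actions),                           # action logits
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

demo_states = torch.randn(1024, state_dim)                # placeholder demos
demo_actions = torch.randint(0, num_actions, (1024,))     # placeholder labels

for epoch in range(20):
    logits = policy(demo_states)
    loss = loss_fn(logits, demo_actions)   # match the demonstrated actions
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```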

Finally, a newer methodology, Inverse Q-Learning, aims to learn Q-functions directly from expert data, implicitly representing rewards; under these Q-functions the optimal policy is given by a Boltzmann distribution, as in soft Q-learning.
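As a small worked illustration of that Boltzmann policy (the Q-values and temperature here are placeholders, not the output of any particular Inverse Q-Learning method):

```python
import torch

# Soft / Boltzmann policy induced by a Q-function:
# pi(a | s) is proportional to exp(Q(s, a) / temperature).
q_values = torch.tensor([1.2, 0.3, -0.5, 0.9])  # placeholder Q(s, .) for one state
temperature = 1.0

policy_probs = torch.softmax(q_values / temperature, dim=0)
print(policy_probs)  # actions with higher Q receive exponentially more probability
```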

Source: Learning to Imitate


Most implemented papers

Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning

Kaixhin/imitation-learning ICLR 2019

We identify two issues with the family of algorithms based on the Adversarial Imitation Learning framework.

Task-Embedded Control Networks for Few-Shot Imitation Learning

stepjam/TecNets 8 Oct 2018

Despite this, most robot learning approaches have focused on learning a single task, from scratch, with a limited notion of generalisation, and no way of leveraging the knowledge to learn other tasks more efficiently.

CompILE: Compositional Imitation Learning and Execution

tkipf/compile 4 Dec 2018

We introduce Compositional Imitation Learning and Execution (CompILE): a framework for learning reusable, variable-length segments of hierarchically-structured behavior from demonstration data.

Go-Explore: a New Approach for Hard-Exploration Problems

uber-research/go-explore 30 Jan 2019

Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge.

Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations

hiwonjoon/ICML2019-TREX 12 Apr 2019

A critical flaw of existing inverse reinforcement learning (IRL) methods is their inability to significantly outperform the demonstrator.

Simitate: A Hybrid Imitation Learning Benchmark

raphaelmemmesheimer/simitate 15 May 2019

We present Simitate, a hybrid benchmarking suite targeting the evaluation of approaches for imitation learning.

A Divergence Minimization Perspective on Imitation Learning Methods

KamyarGh/rl_swiss 6 Nov 2019

We present $f$-MAX, an $f$-divergence generalization of AIRL [Fu et al., 2018], a state-of-the-art IRL method.

Imitation Learning via Off-Policy Distribution Matching

google-research/google-research ICLR 2020

In this work, we show how the original distribution ratio estimation objective may be transformed in a principled manner to yield a completely off-policy objective.

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

ubisoft/ubisoft-la-forge-ASAF NeurIPS 2020

Adversarial Imitation Learning alternates between learning a discriminator, which tells expert demonstrations apart from generated ones, and a generator policy that produces trajectories able to fool this discriminator.
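For a rough picture of that alternation, here is a self-contained toy sketch on random placeholder data; the architectures, one-hot action encoding, and REINFORCE-style policy update are assumptions for illustration, not the specific ASAF or GAIL recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy adversarial imitation loop on random placeholder data.
state_dim, num_actions = 8, 4
policy = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, num_actions))
discriminator = nn.Sequential(nn.Linear(state_dim + num_actions, 64), nn.ReLU(), nn.Linear(64, 1))
policy_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
disc_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

expert_states = torch.randn(256, state_dim)             # placeholder expert data
expert_actions = torch.randint(0, num_actions, (256,))

def disc_input(states, actions):
    # Discriminator sees a (state, one-hot action) pair.
    return torch.cat([states, F.one_hot(actions, num_actions).float()], dim=1)

for iteration in range(100):
    # Generate transitions from the current policy on random states.
    states = torch.randn(256, state_dim)
    dist = torch.distributions.Categorical(logits=policy(states))
    actions = dist.sample()

    # 1) Discriminator step: label expert pairs 1 and generated pairs 0.
    d_expert = discriminator(disc_input(expert_states, expert_actions))
    d_gen = discriminator(disc_input(states, actions))
    d_loss = bce(d_expert, torch.ones_like(d_expert)) + bce(d_gen, torch.zeros_like(d_gen))
    disc_opt.zero_grad()
    d_loss.backward()
    disc_opt.step()

    # 2) Policy step: use the discriminator's "looks expert" score as a reward
    #    and push the policy toward fooling it (simple REINFORCE surrogate).
    with torch.no_grad():
        reward = torch.log(torch.sigmoid(discriminator(disc_input(states, actions))) + 1e-8).squeeze(1)
    p_loss = -(dist.log_prob(actions) * reward).mean()
    policy_opt.zero_grad()
    p_loss.backward()
    policy_opt.step()
```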

BabyAI 1.1

mila-iqia/babyai 24 Jul 2020

This increases reinforcement learning sample efficiency by up to 3 times and improves imitation learning performance on the hardest level from 77% to 90.4%.