Imitation Learning
521 papers with code • 0 benchmarks • 18 datasets
Imitation Learning is a framework for learning a behavior policy from demonstrations. Demonstrations are usually presented as state-action trajectories, where each pair indicates the action taken at the state being visited. The demonstrated actions are typically used in one of two ways. The first, known as Behavior Cloning (BC), treats each action as the target label for its state and learns a generalized mapping from states to actions in a supervised manner. The second, known as Inverse Reinforcement Learning (IRL), views the demonstrated actions as a sequence of decisions and aims to find a reward/cost function under which those decisions are optimal.
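The Behavior Cloning idea above reduces to ordinary supervised learning. A minimal sketch, assuming synthetic demonstrations from a linear expert policy (the data and policy class here are illustrative, not from any specific paper):

```python
import numpy as np

# Hypothetical expert demonstrations: state-action pairs.
rng = np.random.default_rng(0)
states = rng.normal(size=(100, 4))          # 100 visited states, 4 features each
true_weights = np.array([0.5, -1.0, 2.0, 0.3])
actions = states @ true_weights             # expert's (unknown) linear policy

# Behavior Cloning: treat each demonstrated action as the supervised
# target for its state, and fit a state -> action mapping.
# Here the policy class is linear, fit via least squares.
weights, *_ = np.linalg.lstsq(states, actions, rcond=None)

def cloned_policy(state):
    """Generalized mapping from states to actions learned from demos."""
    return state @ weights
```

In practice the linear model would be replaced by a neural network and the least-squares fit by gradient descent on a regression or classification loss, but the supervised structure is the same.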
Finally, a newer methodology, Inverse Q-Learning, aims at directly learning Q-functions from expert data, implicitly representing rewards; the optimal policy can then be given as a Boltzmann distribution over the Q-values, similar to soft Q-learning.
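The Boltzmann distribution mentioned above turns Q-values into action probabilities via a temperature-scaled softmax. A minimal sketch with illustrative Q-values for a single state:

```python
import numpy as np

# Toy Q-values for 3 actions in one state (values are illustrative).
q_values = np.array([1.0, 2.0, 0.5])
temperature = 1.0  # soft Q-learning temperature; lower = greedier policy

# Boltzmann policy: pi(a|s) is proportional to exp(Q(s,a) / temperature).
logits = q_values / temperature
probs = np.exp(logits - logits.max())  # subtract max for numerical stability
probs /= probs.sum()
```

The highest-valued action gets the largest probability mass, while suboptimal actions retain nonzero probability, which is what makes the policy "soft".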
Source: Learning to Imitate
Benchmarks
These leaderboards are used to track progress in Imitation Learning
Libraries
Use these libraries to find Imitation Learning models and implementations
Datasets
Most implemented papers
Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning
We identify two issues with the family of algorithms based on the Adversarial Imitation Learning framework.
Task-Embedded Control Networks for Few-Shot Imitation Learning
Despite this, most robot learning approaches have focused on learning a single task, from scratch, with a limited notion of generalisation, and no way of leveraging the knowledge to learn other tasks more efficiently.
CompILE: Compositional Imitation Learning and Execution
We introduce Compositional Imitation Learning and Execution (CompILE): a framework for learning reusable, variable-length segments of hierarchically-structured behavior from demonstration data.
Go-Explore: a New Approach for Hard-Exploration Problems
Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge.
Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations
A critical flaw of existing inverse reinforcement learning (IRL) methods is their inability to significantly outperform the demonstrator.
Simitate: A Hybrid Imitation Learning Benchmark
We present Simitate --- a hybrid benchmarking suite targeting the evaluation of approaches for imitation learning.
A Divergence Minimization Perspective on Imitation Learning Methods
We present $f$-MAX, an $f$-divergence generalization of AIRL [Fu et al., 2018], a state-of-the-art IRL method.
Imitation Learning via Off-Policy Distribution Matching
In this work, we show how the original distribution ratio estimation objective may be transformed in a principled manner to yield a completely off-policy objective.
Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization
Adversarial Imitation Learning alternates between learning a discriminator -- which tells apart expert's demonstrations from generated ones -- and a generator's policy to produce trajectories that can fool this discriminator.
BabyAI 1.1
This increases reinforcement learning sample efficiency by up to 3 times and improves imitation learning performance on the hardest level from 77% to 90.4%.