Bayesian Reward Extrapolation

Introduced by Brown et al. in Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences

Bayesian Reward Extrapolation is a Bayesian reward learning algorithm that scales to high-dimensional imitation learning problems by pre-training a low-dimensional feature encoding via self-supervised tasks and then leveraging preferences over demonstrations to perform fast Bayesian inference.

Source: Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences
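
The inference step described above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation. It assumes that each demonstration has already been mapped to a low-dimensional feature vector by a pre-trained encoder, that the reward is linear in those features, that preferences follow a Bradley-Terry (softmax) likelihood, and that the prior over unit-norm weights is uniform. The names `log_likelihood`, `mcmc_reward_posterior`, `feats`, `prefs`, `beta`, and `step`, and the toy data, are all hypothetical.

```python
import numpy as np

def log_likelihood(w, feats, prefs, beta=1.0):
    """Bradley-Terry log-likelihood of pairwise preferences under linear reward w.

    feats: (n_demos, d) array, one precomputed feature vector per demonstration.
    prefs: list of (i, j) pairs meaning demonstration j is preferred to i.
    """
    returns = beta * (feats @ w)  # one scalar return per demonstration
    ll = 0.0
    for i, j in prefs:
        # log of exp(R_j) / (exp(R_i) + exp(R_j)), computed stably
        ll += returns[j] - np.logaddexp(returns[i], returns[j])
    return ll

def mcmc_reward_posterior(feats, prefs, n_samples=2000, step=0.3, beta=1.0, seed=0):
    """Metropolis sampling over unit-norm reward weights (uniform prior assumed)."""
    rng = np.random.default_rng(seed)
    d = feats.shape[1]
    w = rng.normal(size=d)
    w /= np.linalg.norm(w)
    ll = log_likelihood(w, feats, prefs, beta)
    samples = []
    for _ in range(n_samples):
        # Gaussian perturbation, then projection back onto the unit sphere
        prop = w + step * rng.normal(size=d)
        prop /= np.linalg.norm(prop)
        ll_prop = log_likelihood(prop, feats, prefs, beta)
        # Accept with probability min(1, likelihood ratio)
        if np.log(rng.uniform()) < ll_prop - ll:
            w, ll = prop, ll_prop
        samples.append(w.copy())
    return np.array(samples)

# Toy data (made up): four demonstrations ranked from worst to best,
# where the first feature grows with demonstration quality.
feats = np.array([[0.0, 1.0], [1.0, 0.5], [2.0, -0.3], [3.0, 0.2]])
prefs = [(i, j) for i in range(4) for j in range(i + 1, 4)]
samples = mcmc_reward_posterior(feats, prefs, beta=5.0)
posterior_mean = samples[500:].mean(axis=0)  # discard the first 500 as burn-in
```

Because the features are computed once per demonstration, each likelihood evaluation in this sketch costs only a matrix-vector product, which is what makes sampling cheap. The resulting set of weight samples approximates a posterior over reward functions rather than a single point estimate.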

Tasks

Task	Papers	Share
Atari Games	1	50.00%
Imitation Learning	1	50.00%

Categories

Bayesian Reinforcement Learning