Bayesian Reward Extrapolation is a Bayesian reward learning algorithm that scales to high-dimensional imitation learning problems by pre-training a low-dimensional feature encoding via self-supervised tasks and then leveraging preferences over demonstrations to perform fast Bayesian inference.
Source: Safe Imitation Learning via Fast Bayesian Reward Inference from PreferencesPaper | Code | Results | Date | Stars |
---|
Component | Type |
|
---|---|---|
🤖 No Components Found | You can add them if they exist; e.g. Mask R-CNN uses RoIAlign |