no code implementations • 29 Jan 2024 • Federico Malato, Florian Leopold, Andrew Melnik, Ville Hautamaki
Behavioral cloning uses a dataset of demonstrations to learn a policy.
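As a rough illustration of the idea, behavioral cloning can be read as supervised learning on (state, action) pairs taken from demonstrations. The sketch below is not the authors' method: it assumes a tiny discrete state space and uses a majority-vote lookup table in place of a learned neural policy. The function name `fit_policy` and the toy state and action labels are hypothetical.

```python
from collections import Counter, defaultdict


def fit_policy(demonstrations):
    """Fit a tabular policy: map each state to its most frequent demonstrated action.

    `demonstrations` is an iterable of (state, action) pairs. This is a
    simplification for illustration; a learned classifier would replace
    the lookup table in practice.
    """
    counts = defaultdict(Counter)
    for state, action in demonstrations:
        counts[state][action] += 1
    # Pick the majority action for every state seen in the data.
    return {state: c.most_common(1)[0][0] for state, c in counts.items()}


# Hypothetical toy demonstrations (state, action).
demos = [
    ("wall_ahead", "turn"),
    ("wall_ahead", "turn"),
    ("wall_ahead", "forward"),
    ("clear", "forward"),
]
policy = fit_policy(demos)
```

The resulting `policy` only covers states that appear in the demonstrations, which is one reason behavioral cloning can struggle in situations the dataset does not contain.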
no code implementations • 15 Jun 2023 • Federico Malato, Florian Leopold, Ville Hautamaki, Andrew Melnik
Actions from a selected similar situation can be performed by the agent until representations of the agent's current situation and the selected experience diverge in the latent space.
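The control loop described above can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the source: latents are plain tuples already produced by some encoder (omitted here), similarity is Euclidean distance, divergence is a fixed `threshold`, and the demonstrations form a single flat trajectory. All function and parameter names are hypothetical.

```python
import math


def l2(a, b):
    """Euclidean distance between two latent vectors given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_index(query, latents):
    """Index of the stored latent closest to `query`."""
    return min(range(len(latents)), key=lambda i: l2(query, latents[i]))


def follow_until_divergence(agent_latents, demo_latents, demo_actions, threshold):
    """Replay demonstrated actions, re-searching whenever the latents diverge.

    `agent_latents` stands in for the encoder output at each agent step.
    Returns the chosen actions and the number of searches performed.
    """
    actions, searches, ptr = [], 0, None
    for z in agent_latents:
        diverged = (
            ptr is None
            or ptr >= len(demo_latents)
            or l2(z, demo_latents[ptr]) > threshold
        )
        if diverged:
            # Select a new similar situation from the demonstrations.
            ptr = nearest_index(z, demo_latents)
            searches += 1
        actions.append(demo_actions[ptr])
        ptr += 1  # Keep following the selected experience.
    return actions, searches


# Hypothetical toy data: five stored situations and their actions.
demo_latents = [(0, 0), (1, 0), (2, 0), (10, 10), (11, 10)]
demo_actions = ["a", "b", "c", "d", "e"]
agent_latents = [(0.1, 0), (1.1, 0), (10.2, 10), (11.1, 10)]

actions, searches = follow_until_divergence(
    agent_latents, demo_latents, demo_actions, threshold=0.5
)
```

In this toy run the agent follows the first stored situation for two steps, diverges at the third step, and searches again, so only two searches are needed for four actions.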
no code implementations • 23 Mar 2023 • Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Sharada Mohanty, Byron Galbraith, Ke Chen, Yan Song, Tianze Zhou, Bingquan Yu, He Liu, Kai Guan, Yujing Hu, Tangjie Lv, Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik, Shu Ishida, João F. Henriques, Robert Klassert, Walter Laurito, Ellen Novoseller, Vinicius G. Goecks, Nicholas Waytowich, David Watkins, Josh Miller, Rohin Shah
To facilitate research in the direction of fine-tuning foundation models from human feedback, we held the MineRL BASALT Competition on Fine-Tuning from Human Feedback at NeurIPS 2022.
no code implementations • 27 Dec 2022 • Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik
Our approach can effectively recover meaningful demonstration trajectories and show human-like behavior of an agent in the Minecraft environment.