Search Results for author: Stephanie Milani

Found 18 papers, 3 papers with code

BEDD: The MineRL BASALT Evaluation and Demonstrations Dataset for Training and Benchmarking Agents that Solve Fuzzy Tasks

1 code implementation • NeurIPS 2023 • Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Rohin Shah

Given the completion of two years of BASALT competitions, we offer to the community a formalized benchmark through the BASALT Evaluation and Demonstrations Dataset (BEDD), which serves as a resource for algorithm development and performance assessment.

Benchmarking
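
BEDD packages the BASALT tasks and human demonstrations for benchmarking. As a rough sketch of how an agent interacts with a BASALT task (assuming the `minerl` package, which registers the BASALT Gym environments under ids such as `MineRLBasaltFindCave-v0`, and the older Gym step API used in minerl's examples):

```python
# Illustrative sketch: roll out a random agent in a BASALT task.
import gym
import minerl  # noqa: F401  (importing registers the MineRL/BASALT envs)

env = gym.make("MineRLBasaltFindCave-v0")
obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # replace with a learned policy
    obs, reward, done, info = env.step(action)  # BASALT provides no reward
env.close()
```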

MABL: Bi-Level Latent-Variable World Model for Sample-Efficient Multi-Agent Reinforcement Learning

no code implementations • 12 Apr 2023 • Aravind Venugopal, Stephanie Milani, Fei Fang, Balaraman Ravindran

Unlike existing models, MABL is capable of encoding essential global information into the latent states during training while guaranteeing the decentralized execution of learned policies.

reinforcement-learning • SMAC+
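
A minimal sketch of that bi-level idea (an illustrative PyTorch paraphrase, not the authors' code; all layer sizes and names are invented): a global latent is inferred from the joint observation during centralized training, while execution relies only on a locally inferred latent, so no agent needs global information at test time.

```python
import torch
import torch.nn as nn

class BiLevelWorldModel(nn.Module):
    def __init__(self, obs_dim, n_agents, z_global=32, z_local=16, n_actions=5):
        super().__init__()
        # Upper level: global latent from the *joint* observation
        # (available only during centralized training).
        self.global_enc = nn.Linear(obs_dim * n_agents, z_global)
        # Lower level: per-agent latent conditioned on the global latent.
        self.local_enc = nn.Linear(obs_dim + z_global, z_local)
        # Prior over the local latent from local obs alone, so that
        # execution never needs the joint observation.
        self.local_prior = nn.Linear(obs_dim, z_local)
        self.policy = nn.Linear(obs_dim + z_local, n_actions)

    def act(self, local_obs):
        # Decentralized execution path: local observation only.
        z = self.local_prior(local_obs)
        return self.policy(torch.cat([local_obs, z], dim=-1)).argmax(dim=-1)
```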

Navigates Like Me: Understanding How People Evaluate Human-Like AI in Video Games

no code implementations • 2 Mar 2023 • Stephanie Milani, Arthur Juliani, Ida Momennejad, Raluca Georgescu, Jaroslaw Rzepecki, Alison Shaw, Gavin Costello, Fei Fang, Sam Devlin, Katja Hofmann

We aim to understand how people assess human likeness in navigation produced by people and artificially intelligent (AI) agents in a video game.

UniMASK: Unified Inference in Sequential Decision Problems

1 code implementation • 20 Nov 2022 • Micah Carroll, Orr Paradise, Jessy Lin, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin

Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks.

Decision Making
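
The core mechanism transfers directly to trajectories: randomly mask state/action tokens and train a model to reconstruct them, where the choice of masking scheme selects the downstream task (behavioral cloning, forward or inverse dynamics, and so on). A hedged sketch with invented shapes, not the authors' implementation:

```python
import torch

def random_mask(tokens, mask_token, p=0.15):
    """Randomly hide tokens in a trajectory sequence.

    tokens: (seq_len, dim) interleaved state/action embeddings.
    mask_token: (dim,) learned mask embedding.
    Returns the masked sequence and the boolean mask of hidden positions.
    """
    hidden = torch.rand(tokens.shape[0]) < p
    masked = tokens.clone()
    masked[hidden] = mask_token
    return masked, hidden

# A training step would then predict the hidden tokens, e.g.:
#   pred = model(masked)
#   loss = ((pred[hidden] - tokens[hidden]) ** 2).mean()
```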

MineRL Diamond 2021 Competition: Overview, Results, and Lessons Learned

no code implementations • 17 Feb 2022 • Anssi Kanervisto, Stephanie Milani, Karolis Ramanauskas, Nicholay Topin, Zichuan Lin, Junyou Li, Jianing Shi, Deheng Ye, Qiang Fu, Wei Yang, Weijun Hong, Zhongyue Huang, Haicheng Chen, Guangjun Zeng, Yue Lin, Vincent Micheli, Eloi Alonso, François Fleuret, Alexander Nikulin, Yury Belousov, Oleg Svidchenko, Aleksei Shpilman

With this in mind, we hosted the third edition of the MineRL ObtainDiamond competition, MineRL Diamond 2021, with a separate track in which we permitted any solution to promote the participation of newcomers.

A Survey of Explainable Reinforcement Learning

no code implementations • 17 Feb 2022 • Stephanie Milani, Nicholay Topin, Manuela Veloso, Fei Fang

In this survey, we propose a novel taxonomy for organizing the XRL literature that prioritizes the RL setting.

Decision Making • reinforcement-learning • +1

The MineRL BASALT Competition on Learning from Human Feedback

no code implementations • 5 Jul 2021 • Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan

Rather than training AI systems using a predefined reward function or using a labeled dataset with a predefined set of categories, we instead train the AI system using a learning signal derived from some form of human feedback, which can evolve over time as the understanding of the task changes, or as the capabilities of the AI system improve.

Imitation Learning
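
BASALT deliberately leaves the learning method open, but one common instantiation of a human-feedback signal is a reward model fit to pairwise preferences over trajectory clips via a Bradley-Terry model. An illustrative sketch with an invented feature size, not the competition's prescribed method:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Invented feature size: each timestep of a clip is a 64-dim feature vector.
reward_model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 1))

def preference_loss(clip_a, clip_b, a_preferred):
    """clip_*: (T, 64) tensors; a_preferred: 1.0 if the human chose clip A."""
    r_a = reward_model(clip_a).sum()  # predicted return of clip A
    r_b = reward_model(clip_b).sum()
    # Bradley-Terry: P(A preferred over B) = sigmoid(r_a - r_b)
    return F.binary_cross_entropy_with_logits(
        (r_a - r_b).unsqueeze(0), torch.tensor([a_preferred]))
```

As the task understanding evolves, new preference labels can be collected and the reward model refit, matching the abstract's point that the learning signal can change over time.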

Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods

no code implementations • 25 Feb 2021 • Nicholay Topin, Stephanie Milani, Fei Fang, Manuela Veloso

Because of this decision tree equivalence, any function approximator can be used during training, including a neural network, while yielding a decision tree policy for the base MDP.

reinforcement-learning • Reinforcement Learning (RL)
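
For concreteness, a decision-tree policy of the kind the method yields is just a fixed sequence of feature-threshold tests over base-MDP state features, which is what makes it interpretable. A toy, hand-written example (real trees are learned, and this is not the paper's tree):

```python
def tree_policy(state):
    # state: dict of named features for a cart-pole-like toy task
    if state["pole_angle"] < -0.05:
        return "push_left"
    if state["pole_angle"] > 0.05:
        return "push_right"
    return "push_right" if state["cart_velocity"] < 0 else "push_left"
```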

The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors

no code implementations • 26 Jan 2021 • William H. Guss, Mario Ynocente Castro, Sam Devlin, Brandon Houghton, Noboru Sean Kuno, Crissman Loomis, Stephanie Milani, Sharada Mohanty, Keisuke Nakata, Ruslan Salakhutdinov, John Schulman, Shinya Shiroshita, Nicholay Topin, Avinash Ummadisingu, Oriol Vinyals

Although deep reinforcement learning has led to breakthroughs in many difficult domains, these successes have required an ever-increasing number of samples, affording only a shrinking segment of the AI community access to their development.

Decision Making • Efficient Exploration • +2

Guaranteeing Reproducibility in Deep Learning Competitions

no code implementations • 12 May 2020 • Brandon Houghton, Stephanie Milani, Nicholay Topin, William Guss, Katja Hofmann, Diego Perez-Liebana, Manuela Veloso, Ruslan Salakhutdinov

To encourage the development of methods with reproducible and robust training behavior, we propose a challenge paradigm where competitors are evaluated directly on the performance of their learning procedures rather than pre-trained agents.

Retrospective Analysis of the 2019 MineRL Competition on Sample Efficient Reinforcement Learning

no code implementations • 10 Mar 2020 • Stephanie Milani, Nicholay Topin, Brandon Houghton, William H. Guss, Sharada P. Mohanty, Keisuke Nakata, Oriol Vinyals, Noboru Sean Kuno

To facilitate research in the direction of sample efficient reinforcement learning, we held the MineRL Competition on Sample Efficient Reinforcement Learning Using Human Priors at the Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019).

Imitation Learning • reinforcement-learning • +1

Planning with Abstract Learned Models While Learning Transferable Subtasks

no code implementations • 16 Dec 2019 • John Winder, Stephanie Milani, Matthew Landen, Erebus Oh, Shane Parr, Shawn Squire, Marie desJardins, Cynthia Matuszek

We introduce an algorithm for model-based hierarchical reinforcement learning to acquire self-contained transition and reward models suitable for probabilistic planning at multiple levels of abstraction.

Hierarchical Reinforcement Learning • reinforcement-learning • +1
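
A generic sketch of the model-then-plan recipe (an illustration of the general idea, not the authors' algorithm): estimate tabular transition and reward models from experience counts, then run value iteration on the learned model to obtain a plan at that level of abstraction.

```python
import numpy as np

def plan(counts, rewards, gamma=0.95, iters=100):
    """counts: (S, A, S) visit counts; rewards: (S, A) summed rewards."""
    n_sa = counts.sum(-1, keepdims=True)
    P = counts / np.maximum(n_sa, 1)             # estimated T(s' | s, a)
    R = rewards / np.maximum(n_sa[..., 0], 1)    # estimated R(s, a)
    V = np.zeros(counts.shape[0])
    for _ in range(iters):
        V = (R + gamma * P @ V).max(axis=1)      # Bellman optimality backup
    return (R + gamma * P @ V).argmax(axis=1)    # greedy policy per state
```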

The MineRL 2019 Competition on Sample Efficient Reinforcement Learning using Human Priors

1 code implementation • 22 Apr 2019 • William H. Guss, Cayden Codel, Katja Hofmann, Brandon Houghton, Noboru Kuno, Stephanie Milani, Sharada Mohanty, Diego Perez-Liebana, Ruslan Salakhutdinov, Nicholay Topin, Manuela Veloso, Phillip Wang

To that end, we introduce: (1) the Minecraft ObtainDiamond task, a sequential decision making environment requiring long-term planning, hierarchical control, and efficient exploration methods; and (2) the MineRL-v0 dataset, a large-scale collection of over 60 million state-action pairs of human demonstrations that can be resimulated into embodied trajectories with arbitrary modifications to game state and visuals.

Decision Making • Efficient Exploration • +2
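
The MineRL-v0 demonstrations are distributed through the `minerl` package; a sketch of iterating over them, using `minerl.data.make` and `batch_iter` as documented in minerl's data pipeline (names assumed from that documentation, and the dataset must be fetched first, e.g. with `minerl.data.download`):

```python
import minerl

data = minerl.data.make("MineRLObtainDiamond-v0", data_dir="data")
for obs, action, reward, next_obs, done in data.batch_iter(
        batch_size=32, seq_len=32, num_epochs=1):
    # obs["pov"] holds the (batch, seq, 64, 64, 3) video frames;
    # (obs, action) pairs can drive imitation learning, for example.
    pass
```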
