no code implementations • 16 Apr 2024 • Kyle Hsu, Jubayer Ibn Hamid, Kaylee Burns, Chelsea Finn, Jiajun Wu
Inductive biases are crucial in disentangled representation learning for narrowing down an underspecified solution set.
no code implementations • 28 Mar 2024 • Ryan Park, Rafael Rafailov, Stefano Ermon, Chelsea Finn
A number of approaches have been developed to control those biases in the classical RLHF literature, but the problem remains relatively under-explored for Direct Alignment Algorithms such as Direct Preference Optimization (DPO).
no code implementations • 19 Mar 2024 • Lucy Xiaoyang Shi, Zheyuan Hu, Tony Z. Zhao, Archit Sharma, Karl Pertsch, Jianlan Luo, Sergey Levine, Chelsea Finn
In this paper, we make the following observation: high-level policies that index into sufficiently rich and expressive low-level language-conditioned skills can be readily supervised with human feedback in the form of language corrections.
no code implementations • 8 Mar 2024 • Jensen Gao, Annie Xie, Ted Xiao, Chelsea Finn, Dorsa Sadigh
Recent works on large-scale robotic data collection typically vary a wide range of environmental factors during data collection, such as object types and table textures.
1 code implementation • 22 Feb 2024 • Johnathan Xie, Yoonho Lee, Annie S. Chen, Chelsea Finn
Self-supervised learning excels in learning representations from large amounts of unlabeled data, demonstrating success across multiple data modalities.
1 code implementation • 19 Feb 2024 • Archit Sharma, Sedrick Keh, Eric Mitchell, Chelsea Finn, Kushal Arora, Thomas Kollar
RLAIF first performs supervised fine-tuning (SFT) using demonstrations from a teacher model and then further fine-tunes the model with reinforcement learning (RL), using feedback from a critic model.
1 code implementation • 18 Feb 2024 • Yiyang Zhou, Chenhang Cui, Rafael Rafailov, Chelsea Finn, Huaxiu Yao
This procedure is not perfect and can cause the model to hallucinate - provide answers that do not accurately reflect the image, even when the core LLM is highly factual and the vision backbone has sufficiently complete representations.
1 code implementation • 16 Feb 2024 • Moritz Stephan, Alexander Khazatsky, Eric Mitchell, Annie S Chen, Sheryl Hsu, Archit Sharma, Chelsea Finn
The diversity of contexts in which large language models (LLMs) are deployed requires the ability to modify or customize default model behaviors to incorporate nuanced requirements and preferences.
no code implementations • 12 Feb 2024 • Soroush Nasiriany, Fei Xia, Wenhao Yu, Ted Xiao, Jacky Liang, Ishita Dasgupta, Annie Xie, Danny Driess, Ayzaan Wahid, Zhuo Xu, Quan Vuong, Tingnan Zhang, Tsang-Wei Edward Lee, Kuang-Huei Lee, Peng Xu, Sean Kirmani, Yuke Zhu, Andy Zeng, Karol Hausman, Nicolas Heess, Chelsea Finn, Sergey Levine, Brian Ichter
In each iteration, the image is annotated with a visual representation of proposals that the VLM can refer to (e. g., candidate robot actions, localizations, or trajectories).
1 code implementation • 7 Feb 2024 • Allan Zhou, Chelsea Finn, James Harrison
A challenging problem in many modern machine learning tasks is to process weight-space features, i. e., to transform or extract information from the weights and gradients of a neural network.
no code implementations • 6 Feb 2024 • Yoonho Lee, Michelle S. Lam, Helena Vasconcelos, Michael S. Bernstein, Chelsea Finn
Additionally, we use Clarify to find and rectify 31 novel hard subpopulations in the ImageNet dataset, improving minority-split accuracy from 21. 1% to 28. 7%.
no code implementations • 29 Jan 2024 • Jianlan Luo, Zheyuan Hu, Charles Xu, You Liang Tan, Jacob Berg, Archit Sharma, Stefan Schaal, Chelsea Finn, Abhishek Gupta, Sergey Levine
We posit that a significant challenge to widespread adoption of robotic RL, as well as further development of robotic RL methods, is the comparative inaccessibility of such methods.
no code implementations • 23 Jan 2024 • Michael Ahn, Debidatta Dwibedi, Chelsea Finn, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Karol Hausman, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Sean Kirmani, Edward Lee, Sergey Levine, Yao Lu, Isabel Leal, Sharath Maddineni, Kanishka Rao, Dorsa Sadigh, Pannag Sanketi, Pierre Sermanet, Quan Vuong, Stefan Welker, Fei Xia, Ted Xiao, Peng Xu, Steve Xu, Zhuo Xu
We experimentally show that such "in-the-wild" data collected by AutoRT is significantly more diverse, and that AutoRT's use of LLMs allows for instruction following data collection robots that can align to human preferences.
no code implementations • 18 Jan 2024 • Caroline Choi, Yoonho Lee, Annie Chen, Allan Zhou, aditi raghunathan, Chelsea Finn
Given a task, AutoFT searches for a fine-tuning procedure that enhances out-of-distribution (OOD) generalization.
no code implementations • 6 Jan 2024 • Rafael Rafailov, Kyle Hatch, Victor Kolev, John D. Martin, Mariano Phielipp, Chelsea Finn
We study the problem of offline pre-training and online fine-tuning for reinforcement learning from high-dimensional observations in the context of realistic robot tasks.
no code implementations • 4 Jan 2024 • Zipeng Fu, Tony Z. Zhao, Chelsea Finn
We first present Mobile ALOHA, a low-cost and whole-body teleoperation system for data collection.
no code implementations • 14 Nov 2023 • Katherine Tian, Eric Mitchell, Huaxiu Yao, Christopher D. Manning, Chelsea Finn
The fluency and creativity of large pre-trained language models (LLMs) have led to their widespread use, sometimes even as a replacement for traditional search engines.
no code implementations • 3 Nov 2023 • Jiayuan Gu, Sean Kirmani, Paul Wohlhart, Yao Lu, Montserrat Gonzalez Arenas, Kanishka Rao, Wenhao Yu, Chuyuan Fu, Keerthana Gopalakrishnan, Zhuo Xu, Priya Sundaresan, Peng Xu, Hao Su, Karol Hausman, Chelsea Finn, Quan Vuong, Ted Xiao
Generalization remains one of the most important desiderata for robust robot learning systems.
no code implementations • 3 Nov 2023 • Kaylee Burns, Zach Witzel, Jubayer Ibn Hamid, Tianhe Yu, Chelsea Finn, Karol Hausman
Inspired by the success of transfer learning in computer vision, roboticists have investigated visual pre-training as a means to improve the learning efficiency and generalization ability of policies learned from pixels.
no code implementations • 2 Nov 2023 • Annie S. Chen, Govind Chada, Laura Smith, Archit Sharma, Zipeng Fu, Sergey Levine, Chelsea Finn
We provide theoretical analysis of our selection mechanism and demonstrate that ROAM enables a robot to adapt rapidly to changes in dynamics both in simulation and on a real Go1 quadruped, even successfully moving forward with roller skates on its feet.
no code implementations • 23 Oct 2023 • Jingyun Yang, Max Sobol Mark, Brandon Vu, Archit Sharma, Jeannette Bohg, Chelsea Finn
We aim to enable this paradigm in robotic reinforcement learning, allowing a robot to learn a new task with little human effort by leveraging data and models from the Internet.
1 code implementation • 20 Oct 2023 • Joey Hejna, Rafael Rafailov, Harshit Sikchi, Chelsea Finn, Scott Niekum, W. Bradley Knox, Dorsa Sadigh
Thus, learning a reward function from feedback is not only based on a flawed assumption of human preference, but also leads to unwieldy optimization challenges that stem from policy gradients or bootstrapping in the RL phase.
1 code implementation • 19 Oct 2023 • Eric Mitchell, Rafael Rafailov, Archit Sharma, Chelsea Finn, Christopher D. Manning
To aid in doing so, we introduce a novel technique for decoupling the knowledge and skills gained in these two stages, enabling a direct answer to the question, "What would happen if we combined the knowledge learned by a large model during pre-training with the knowledge learned by a small model during fine-tuning (or vice versa)?"
1 code implementation • 12 Oct 2023 • Max Sobol Mark, Archit Sharma, Fahim Tajwar, Rafael Rafailov, Sergey Levine, Chelsea Finn
Can we leverage offline RL to recover better policies from online interaction?
1 code implementation • 1 Oct 2023 • Yiyang Zhou, Chenhang Cui, Jaehong Yoon, Linjun Zhang, Zhun Deng, Chelsea Finn, Mohit Bansal, Huaxiu Yao
Large vision-language models (LVLMs) have shown remarkable abilities in understanding visual information with human languages.
no code implementations • 18 Sep 2023 • Yevgen Chebotar, Quan Vuong, Alex Irpan, Karol Hausman, Fei Xia, Yao Lu, Aviral Kumar, Tianhe Yu, Alexander Herzog, Karl Pertsch, Keerthana Gopalakrishnan, Julian Ibarz, Ofir Nachum, Sumedh Sontakke, Grecia Salazar, Huong T Tran, Jodilyn Peralta, Clayton Tan, Deeksha Manjunath, Jaspiar Singht, Brianna Zitkovich, Tomas Jackson, Kanishka Rao, Chelsea Finn, Sergey Levine
In this work, we present a scalable reinforcement learning method for training multi-task policies from large offline datasets that can leverage both human demonstrations and autonomously collected data.
no code implementations • 11 Sep 2023 • Ziwen Zhuang, Zipeng Fu, Jianren Wang, Christopher Atkeson, Soeren Schwertfeger, Chelsea Finn, Hang Zhao
Parkour is a grand challenge for legged locomotion that requires robots to overcome various obstacles rapidly in complex environments.
1 code implementation • 24 Aug 2023 • Homer Walke, Kevin Black, Abraham Lee, Moo Jin Kim, Max Du, Chongyi Zheng, Tony Zhao, Philippe Hansen-Estruch, Quan Vuong, Andre He, Vivek Myers, Kuan Fang, Chelsea Finn, Sergey Levine
By publicly sharing BridgeData V2 and our pre-trained models, we aim to accelerate research in scalable robot learning methods.
1 code implementation • 28 Jul 2023 • Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Xi Chen, Krzysztof Choromanski, Tianli Ding, Danny Driess, Avinava Dubey, Chelsea Finn, Pete Florence, Chuyuan Fu, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Kehang Han, Karol Hausman, Alexander Herzog, Jasmine Hsu, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Lisa Lee, Tsang-Wei Edward Lee, Sergey Levine, Yao Lu, Henryk Michalewski, Igor Mordatch, Karl Pertsch, Kanishka Rao, Krista Reymann, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Pierre Sermanet, Jaspiar Singh, Anikait Singh, Radu Soricut, Huong Tran, Vincent Vanhoucke, Quan Vuong, Ayzaan Wahid, Stefan Welker, Paul Wohlhart, Jialin Wu, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich
Our goal is to enable a single end-to-end trained model to both learn to map robot observations to actions and enjoy the benefits of large-scale pretraining on language and vision-language data from the web.
no code implementations • 26 Jul 2023 • Lucy Xiaoyang Shi, Archit Sharma, Tony Z. Zhao, Chelsea Finn
AWE can be combined with any BC algorithm, and we find that AWE can increase the success rate of state-of-the-art algorithms by up to 25% in simulation and by 4-28% on real-world bimanual manipulation tasks, reducing the decision making horizon by up to a factor of 10.
1 code implementation • 24 Jul 2023 • Kyle Hatch, Benjamin Eysenbach, Rafael Rafailov, Tianhe Yu, Ruslan Salakhutdinov, Sergey Levine, Chelsea Finn
In this paper, we propose a method for offline, example-based control that learns an implicit model of multi-step transitions, rather than a reward function.
no code implementations • 12 Jul 2023 • Moo Jin Kim, Jiajun Wu, Chelsea Finn
Eye-in-hand cameras have shown promise in enabling greater sample efficiency and generalization in vision-based robotic manipulation.
1 code implementation • 11 Jul 2023 • Lukas Haas, Michal Skreta, Silas Alberti, Chelsea Finn
We train two models for evaluations on street-level data and general-purpose image geolocalization; the first model, PIGEON, is trained on data from the game of Geoguessr and is capable of placing over 40% of its guesses within 25 kilometers of the target location globally.
Ranked #1 on Photo geolocation estimation on YFCC4k
no code implementations • 7 Jul 2023 • Annie Xie, Lisa Lee, Ted Xiao, Chelsea Finn
Towards an answer to this question, we study imitation learning policies in simulation and on a real robot language-conditioned manipulation task to quantify the difficulty of generalization to different (sets of) factors.
no code implementations • 7 Jul 2023 • Jonathan Yang, Dorsa Sadigh, Chelsea Finn
Reusing large datasets is crucial to scale vision-based robotic manipulators to everyday scenarios due to the high cost of collecting robotic datasets.
no code implementations • 19 Jun 2023 • Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn
Effective machine learning models learn both robust features that directly determine the outcome of interest (e. g., an object with wheels is more likely to be a car), and shortcut features (e. g., an object on a road is more likely to be a car).
no code implementations • 14 Jun 2023 • Evan Zheran Liu, Sahaana Suri, Tong Mu, Allan Zhou, Chelsea Finn
Specifically, we design an office navigation environment, where the agent's goal is to find a particular office, and office locations differ in different buildings (i. e., tasks).
1 code implementation • 8 Jun 2023 • Caroline Choi, Fahim Tajwar, Yoonho Lee, Huaxiu Yao, Ananya Kumar, Chelsea Finn
Taking inspiration from this result, we present data-driven confidence minimization (DCM), which minimizes confidence on an uncertainty dataset containing examples that the model is likely to misclassify at test time.
1 code implementation • 1 Jun 2023 • Gaoyue Zhou, Victoria Dean, Mohan Kumar Srirama, Aravind Rajeswaran, Jyothish Pari, Kyle Hatch, Aryan Jain, Tianhe Yu, Pieter Abbeel, Lerrel Pinto, Chelsea Finn, Abhinav Gupta
Three challenges limit the progress of robot learning research: robots are expensive (few labs can participate), everyone uses different robots (findings do not generalize across labs), and we lack internet-scale robotics data.
12 code implementations • NeurIPS 2023 • Rafael Rafailov, Archit Sharma, Eric Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn
Existing methods for gaining such steerability collect human labels of the relative quality of model generations and fine-tune the unsupervised LM to align with these preferences, often with reinforcement learning from human feedback (RLHF).
1 code implementation • NeurIPS 2023 • Kyle Hsu, Will Dorrell, James C. R. Whittington, Jiajun Wu, Chelsea Finn
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
1 code implementation • 24 May 2023 • Nathan Hu, Eric Mitchell, Christopher D. Manning, Chelsea Finn
We meta-train a small, autoregressive model to reweight the language modeling loss for each token during online fine-tuning, with the objective of maximizing the out-of-date base question-answering model's ability to answer questions about a document after a single weighted gradient step.
no code implementations • 24 May 2023 • Katherine Tian, Eric Mitchell, Allan Zhou, Archit Sharma, Rafael Rafailov, Huaxiu Yao, Chelsea Finn, Christopher D. Manning
A trustworthy real-world prediction system should produce well-calibrated confidence scores; that is, its confidence in an answer should be indicative of the likelihood that the answer is correct, enabling deferral to an expert in cases of low-confidence predictions.
1 code implementation • 26 Apr 2023 • Stephen Tian, Chelsea Finn, Jiajun Wu
Video is a promising source of knowledge for embodied agents to learn models of the world's dynamics.
no code implementations • 23 Apr 2023 • Tony Z. Zhao, Vikash Kumar, Sergey Levine, Chelsea Finn
Fine manipulation tasks, such as threading cable ties or slotting a battery, are notoriously difficult for robots because they require precision, careful coordination of contact forces, and closed-loop visual feedback.
no code implementations • 18 Apr 2023 • Maximilian Du, Suraj Nair, Dorsa Sadigh, Chelsea Finn
Concretely, we propose a simple approach that uses a small amount of downstream expert data to selectively query relevant behaviors from an offline, unlabeled dataset (including many sub-optimal behaviors).
2 code implementations • NeurIPS 2023 • Mitsuhiko Nakamoto, Yuexiang Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine
Our approach, calibrated Q-learning (Cal-QL), accomplishes this by learning a conservative value function initialization that underestimates the value of the learned policy from offline data, while also being calibrated, in the sense that the learned Q-values are at a reasonable scale.
no code implementations • 2 Mar 2023 • Austin Stone, Ted Xiao, Yao Lu, Keerthana Gopalakrishnan, Kuang-Huei Lee, Quan Vuong, Paul Wohlhart, Sean Kirmani, Brianna Zitkovich, Fei Xia, Chelsea Finn, Karol Hausman
This brings up a notably difficult challenge for robots: while robot learning approaches allow robots to learn many different behaviors from first-hand experience, it is impractical for robots to have first-hand experiences that span all of this semantic information.
no code implementations • 2 Mar 2023 • Archit Sharma, Ahmed M. Ahmed, Rehaan Ahmad, Chelsea Finn
In this work, we propose MEDAL++, a novel design for self-improving robotic systems: given a small set of expert demonstrations at the start, the robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
2 code implementations • 24 Feb 2023 • Siddharth Karamcheti, Suraj Nair, Annie S. Chen, Thomas Kollar, Chelsea Finn, Dorsa Sadigh, Percy Liang
First, we demonstrate that existing representations yield inconsistent results across these tasks: masked autoencoding approaches pick up on low-level spatial features at the cost of high-level semantics, while contrastive learning approaches capture the opposite.
no code implementations • 10 Feb 2023 • Annie S. Chen, Yoonho Lee, Amrith Setlur, Sergey Levine, Chelsea Finn
Transfer learning with a small amount of target data is an effective and common approach to adapting a pre-trained model to distribution shifts.
no code implementations • 6 Feb 2023 • Huaxiu Yao, Xinyu Yang, Xinyi Pan, Shengchao Liu, Pang Wei Koh, Chelsea Finn
Distribution shift presents a significant challenge in machine learning, where models often underperform during the test stage when faced with a different distribution than the one they were trained on.
1 code implementation • 6 Feb 2023 • Amrith Setlur, Don Dennis, Benjamin Eysenbach, aditi raghunathan, Chelsea Finn, Virginia Smith, Sergey Levine
Some robust training algorithms (e. g., Group DRO) specialize to group shifts and require group information on all training points.
2 code implementations • 26 Jan 2023 • Eric Mitchell, Yoonho Lee, Alexander Khazatsky, Christopher D. Manning, Chelsea Finn
In this paper, we identify a property of the structure of an LLM's probability function that is useful for such detection.
no code implementations • 19 Jan 2023 • Jacob Beck, Risto Vuorio, Evan Zheran Liu, Zheng Xiong, Luisa Zintgraf, Chelsea Finn, Shimon Whiteson
Meta-RL is most commonly studied in a problem setting where, given a distribution of tasks, the goal is to learn a policy that is capable of adapting to any new task from the task distribution with as little data as possible.
no code implementations • CVPR 2023 • Allan Zhou, Moo Jin Kim, Lirui Wang, Pete Florence, Chelsea Finn
Expert demonstrations are a rich source of supervision for training visual robotic manipulation policies, but imitation learning methods often require either a large number of demonstrations or expensive online expert supervision to learn reactive closed-loop behaviors.
1 code implementation • 13 Dec 2022 • Anthony Brohan, Noah Brown, Justice Carbajal, Yevgen Chebotar, Joseph Dabis, Chelsea Finn, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Tomas Jackson, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Isabel Leal, Kuang-Huei Lee, Sergey Levine, Yao Lu, Utsav Malla, Deeksha Manjunath, Igor Mordatch, Ofir Nachum, Carolina Parada, Jodilyn Peralta, Emily Perez, Karl Pertsch, Jornell Quiambao, Kanishka Rao, Michael Ryoo, Grecia Salazar, Pannag Sanketi, Kevin Sayed, Jaspiar Singh, Sumedh Sontakke, Austin Stone, Clayton Tan, Huong Tran, Vincent Vanhoucke, Steve Vega, Quan Vuong, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Tianhe Yu, Brianna Zitkovich
By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance.
1 code implementation • 8 Dec 2022 • Yiding Jiang, Evan Zheran Liu, Benjamin Eysenbach, Zico Kolter, Chelsea Finn
Identifying statistical regularities in solutions to some tasks in multi-task reinforcement learning can accelerate the learning of new tasks.
1 code implementation • 27 Nov 2022 • Peter Henderson, Eric Mitchell, Christopher D. Manning, Dan Jurafsky, Chelsea Finn
A growing ecosystem of large, open-source foundation models has reduced the labeled data and technical expertise necessary to apply machine learning to many new problems.
1 code implementation • 25 Nov 2022 • Huaxiu Yao, Caroline Choi, Bochuan Cao, Yoonho Lee, Pang Wei Koh, Chelsea Finn
Temporal shifts -- distribution shifts arising from the passage of time -- often occur gradually and have the additional structure of timestamp metadata.
no code implementations • 21 Nov 2022 • Eric Mitchell, Joseph J. Noh, Siyan Li, William S. Armstrong, Ananth Agarwal, Patrick Liu, Chelsea Finn, Christopher D. Manning
While large pre-trained language models are powerful, their predictions often lack logical consistency across test inputs.
1 code implementation • 16 Nov 2022 • Evan Zheran Liu, Moritz Stephan, Allen Nie, Chris Piech, Emma Brunskill, Chelsea Finn
However, teaching and giving feedback on such software is time-consuming -- standard approaches require instructors to manually grade student-implemented interactive programs.
1 code implementation • 25 Oct 2022 • Xinyu Yang, Huaxiu Yao, Allan Zhou, Chelsea Finn
We study this multi-domain long-tailed learning problem and aim to produce a model that generalizes well across all classes and domains.
1 code implementation • 20 Oct 2022 • Yoonho Lee, Annie S. Chen, Fahim Tajwar, Ananya Kumar, Huaxiu Yao, Percy Liang, Chelsea Finn
A common approach to transfer learning under distribution shift is to fine-tune the last few layers of a pre-trained model, preserving learned features while also adapting to the new task.
1 code implementation • 19 Oct 2022 • Annie Xie, Fahim Tajwar, Archit Sharma, Chelsea Finn
A long-term goal of reinforcement learning is to design agents that can autonomously interact and learn in the world.
no code implementations • 17 Oct 2022 • Annie S. Chen, Archit Sharma, Sergey Levine, Chelsea Finn
We formalize this problem setting, which we call single-life reinforcement learning (SLRL), where an agent must complete a task within a single episode without interventions, utilizing its prior experience while contending with some form of novelty.
1 code implementation • 11 Oct 2022 • Aviral Kumar, Anikait Singh, Frederik Ebert, Mitsuhiko Nakamoto, Yanlai Yang, Chelsea Finn, Sergey Levine
To our knowledge, PTR is the first RL method that succeeds at learning new tasks in a new domain on a real WidowX robot with as few as 10 task demonstrations, by effectively leveraging an existing dataset of diverse multi-task robot data collected in a variety of toy kitchens.
no code implementations • 11 Oct 2022 • Zhenbang Wu, Huaxiu Yao, Zhe Su, David M Liebovitz, Lucas M Glass, James Zou, Chelsea Finn, Jimeng Sun
However, newly approved drugs do not have much historical prescription data and cannot leverage existing drug recommendation methods.
1 code implementation • 11 Oct 2022 • Huaxiu Yao, Yiping Wang, Linjun Zhang, James Zou, Chelsea Finn
In this paper, we propose a simple yet powerful algorithm, C-Mixup, to improve generalization on regression tasks.
no code implementations • 6 Oct 2022 • Andrew J. Nam, Mengye Ren, Chelsea Finn, James L. McClelland
Large language models have recently shown promising progress in mathematical reasoning when fine-tuned with human-generated sequences walking through a sequence of solution steps.
no code implementations • 26 Jul 2022 • Kaylee Burns, Tianhe Yu, Chelsea Finn, Karol Hausman
In this paper, we focus on one particular aspect of heterogeneity: learning from offline data collected at different control frequencies.
1 code implementation • 13 Jun 2022 • Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn
We find that only SERAC achieves high performance on all three problems, consistently outperforming existing approaches to model editing by a significant margin.
no code implementations • 30 May 2022 • Maximilian Du, Olivia Y. Lee, Suraj Nair, Chelsea Finn
In a set of simulated tasks, we find that our system benefits from using audio, and that by using online interventions we are able to improve the success rate of offline imitation learning by ~20%.
1 code implementation • 11 May 2022 • Archit Sharma, Rehaan Ahmad, Chelsea Finn
Prior works have considered an alternating approach where a forward policy learns to solve the task and the backward policy learns to reset the environment, but what initial state distribution should the backward policy reset the agent to?
no code implementations • 18 Apr 2022 • Ali Ghadirzadeh, Petra Poklukar, Karol Arndt, Chelsea Finn, Ville Kyrki, Danica Kragic, Mårten Björkman
We present a data-efficient framework for solving sequential decision-making problems which exploits the combination of reinforcement learning (RL) and latent variable generative models.
3 code implementations • 4 Apr 2022 • Michael Ahn, Anthony Brohan, Noah Brown, Yevgen Chebotar, Omar Cortes, Byron David, Chelsea Finn, Chuyuan Fu, Keerthana Gopalakrishnan, Karol Hausman, Alex Herzog, Daniel Ho, Jasmine Hsu, Julian Ibarz, Brian Ichter, Alex Irpan, Eric Jang, Rosario Jauregui Ruano, Kyle Jeffrey, Sally Jesmonth, Nikhil J Joshi, Ryan Julian, Dmitry Kalashnikov, Yuheng Kuang, Kuang-Huei Lee, Sergey Levine, Yao Lu, Linda Luu, Carolina Parada, Peter Pastor, Jornell Quiambao, Kanishka Rao, Jarek Rettinghouse, Diego Reyes, Pierre Sermanet, Nicolas Sievers, Clayton Tan, Alexander Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Sichun Xu, Mengyuan Yan, Andy Zeng
We show how low-level skills can be combined with large language models so that the language model provides high-level knowledge about the procedures for performing complex and temporally-extended instructions, while value functions associated with these skills provide the grounding necessary to connect this knowledge to a particular physical environment.
1 code implementation • 23 Mar 2022 • Suraj Nair, Aravind Rajeswaran, Vikash Kumar, Chelsea Finn, Abhinav Gupta
We study how visual representations pre-trained on diverse human video data can enable data-efficient learning of downstream robotic manipulation tasks.
1 code implementation • ICLR 2022 • Allan Zhou, Fahim Tajwar, Alexander Robey, Tom Knowles, George J. Pappas, Hamed Hassani, Chelsea Finn
Based on this analysis, we show how a generative approach for learning the nuisance transformations can help transfer invariances across classes and improve performance on a set of imbalanced image classification benchmarks.
Ranked #21 on Long-tail Learning on CIFAR-10-LT (ρ=100)
1 code implementation • 16 Mar 2022 • Xi Chen, Ali Ghadirzadeh, Tianhe Yu, Yuan Gao, Jianhao Wang, Wenzhe Li, Bin Liang, Chelsea Finn, Chongjie Zhang
Offline reinforcement learning methods hold the promise of learning policies from pre-collected datasets without the need to query the environment for new transitions.
no code implementations • ICLR 2022 • Kyle Hsu, Moo Jin Kim, Rafael Rafailov, Jiajun Wu, Chelsea Finn
We study how the choice of visual perspective affects learning and generalization in the context of physical manipulation from raw sensor observations.
no code implementations • 10 Mar 2022 • Allan Zhou, Vikash Kumar, Chelsea Finn, Aravind Rajeswaran
Many tasks in control, robotics, and planning can be specified using desired goal configurations for various entities in the environment.
1 code implementation • 3 Mar 2022 • Michael Zhang, Nimit S. Sohoni, Hongyang R. Zhang, Chelsea Finn, Christopher Ré
As ERM models can be good spurious attribute predictors, CNC works by (1) using a trained ERM model's outputs to identify samples with the same class but dissimilar spurious features, and (2) training a robust model with contrastive learning to learn similar representations for same-class samples.
no code implementations • 14 Feb 2022 • Annie Xie, Shagun Sodhani, Chelsea Finn, Joelle Pineau, Amy Zhang
Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments.
1 code implementation • 7 Feb 2022 • Yoonho Lee, Huaxiu Yao, Chelsea Finn
Many datasets are underspecified: there exist multiple equally viable solutions to a given task.
no code implementations • 4 Feb 2022 • Eric Jang, Alex Irpan, Mohi Khansari, Daniel Kappler, Frederik Ebert, Corey Lynch, Sergey Levine, Chelsea Finn
In this paper, we study the problem of enabling a vision-based robotic manipulation system to generalize to novel tasks, a long-standing challenge in robot learning.
no code implementations • 3 Feb 2022 • Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Chelsea Finn, Sergey Levine
One natural solution is to learn a reward function from the labeled data and use it to label the unlabeled data.
no code implementations • 1 Feb 2022 • Jathushan Rajasegaran, Chelsea Finn, Sergey Levine
In this paper, we study how meta-learning can be applied to tackle online problems of this nature, simultaneously adapting to changing tasks and input distributions and meta-training the model in order to adapt more quickly in the future.
2 code implementations • 2 Jan 2022 • Huaxiu Yao, Yu Wang, Sai Li, Linjun Zhang, Weixin Liang, James Zou, Chelsea Finn
Machine learning algorithms typically assume that training and test examples are drawn from the same distribution.
2 code implementations • ICLR 2022 • Archit Sharma, Kelvin Xu, Nikhil Sardana, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn
In this paper, we aim to address this discrepancy by laying out a framework for Autonomous Reinforcement Learning (ARL): reinforcement learning where the agent not only learns through its own experience, but also contends with lack of human supervision to reset between trials.
1 code implementation • ICLR 2022 • Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang
Unlabeled data can be a powerful point of leverage for mitigating these distribution shifts, as it is frequently much more available than labeled data and can often be obtained from distributions beyond the source distribution as well.
no code implementations • ICLR 2022 • Glen Berseth, Zhiwei Zhang, Grace Zhang, Chelsea Finn, Sergey Levine
Beyond simply transferring past experience to new tasks, our goal is to devise continual reinforcement learning algorithms that learn to learn, using their experience on previous tasks to learn new tasks more quickly.
no code implementations • 7 Dec 2021 • Michael Luo, Ashwin Balakrishna, Brijen Thananjeyan, Suraj Nair, Julian Ibarz, Jie Tan, Chelsea Finn, Ion Stoica, Ken Goldberg
Safe exploration is critical for using reinforcement learning (RL) in risk-sensitive environments.
no code implementations • NeurIPS 2021 • Nicholas Rhinehart, Jenny Wang, Glen Berseth, John D. Co-Reyes, Danijar Hafner, Chelsea Finn, Sergey Levine
We study this question in dynamic partially-observed environments, and argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
no code implementations • NeurIPS 2021 • Ferran Alet, Dylan Doblar, Allan Zhou, Joshua Tenenbaum, Kenji Kawaguchi, Chelsea Finn
Progress in machine learning (ML) stems from a combination of data availability, computational resources, and an appropriate encoding of inductive biases.
2 code implementations • NeurIPS 2021 • Huaxiu Yao, Yu Wang, Ying WEI, Peilin Zhao, Mehrdad Mahdavi, Defu Lian, Chelsea Finn
In ATS, for the first time, we design a neural scheduler to decide which meta-training tasks to use next by predicting the probability being sampled for each candidate task, and train the scheduler to optimize the generalization capacity of the meta-model to unseen tasks.
3 code implementations • ICLR 2022 • Eric Mitchell, Charles Lin, Antoine Bosselut, Chelsea Finn, Christopher D. Manning
To enable easy post-hoc editing at scale, we propose Model Editor Networks using Gradient Decomposition (MEND), a collection of small auxiliary editing networks that use a single desired input-output pair to make fast, local edits to a pre-trained model's behavior.
2 code implementations • 18 Oct 2021 • Marvin Zhang, Sergey Levine, Chelsea Finn
We study the problem of test time robustification, i. e., using the test input to improve model robustness.
no code implementations • 29 Sep 2021 • Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Chelsea Finn, Sergey Levine, Karol Hausman
However, these benefits come at a cost -- for data to be shared between tasks, each transition must be annotated with reward labels corresponding to other tasks.
no code implementations • 29 Sep 2021 • Mohammad Babaeizadeh, Mohammad Taghi Saffar, Suraj Nair, Sergey Levine, Chelsea Finn, Dumitru Erhan
Furthermore, such an agent can internally represent the complex dynamics of the real-world and therefore can acquire a representation useful for a variety of visual perception tasks.
no code implementations • 29 Sep 2021 • Marvin Mengxin Zhang, Sergey Levine, Chelsea Finn
We study the problem of test time robustification, i. e., using the test input to improve model robustness.
2 code implementations • 27 Sep 2021 • Frederik Ebert, Yanlai Yang, Karl Schmeckpeper, Bernadette Bucher, Georgios Georgakis, Kostas Daniilidis, Chelsea Finn, Sergey Levine
Robot learning holds the promise of learning policies that generalize broadly.
1 code implementation • 22 Sep 2021 • Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine
To this end, we devise a set of metrics and conditions that can be tracked over the course of offline training, and can inform the practitioner about how the algorithm and model architecture should be adjusted to improve final performance.
no code implementations • 21 Sep 2021 • Bohan Wu, Suraj Nair, Li Fei-Fei, Chelsea Finn
In this paper, we study the problem of learning a repertoire of low-level skills from raw images that can be sequenced to complete long-horizon visuomotor tasks.
Model-based Reinforcement Learning reinforcement-learning +1
no code implementations • 19 Sep 2021 • Annie Xie, Chelsea Finn
Multi-task learning ideally allows robots to acquire a diverse repertoire of useful skills.
no code implementations • NeurIPS 2021 • Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Sergey Levine, Chelsea Finn
We argue that a natural use case of offline RL is in settings where we can pool large amounts of data collected in various scenarios for solving different tasks, and utilize all of this data to learn behaviors for all the tasks more effectively rather than training each one in isolation.
1 code implementation • NeurIPS 2021 • Christopher Fifty, Ehsan Amid, Zhe Zhao, Tianhe Yu, Rohan Anil, Chelsea Finn
Multi-task learning can leverage information learned by one task to benefit the training of other tasks.
no code implementations • 2 Sep 2021 • Suraj Nair, Eric Mitchell, Kevin Chen, Brian Ichter, Silvio Savarese, Chelsea Finn
However, goal images also have a number of drawbacks: they are inconvenient for humans to provide, they can over-specify the desired behavior leading to a sparse reward signal, or under-specify task information in the case of non-goal reaching tasks.
2 code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, aditi raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
AI is undergoing a paradigm shift with the rise of models (e. g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.
no code implementations • 29 Jul 2021 • John Willes, James Harrison, Ali Harakeh, Chelsea Finn, Marco Pavone, Steven Waslander
As autonomous decision-making agents move from narrow operating environments to unstructured worlds, learning systems must move from a closed-world formulation to an open-world and few-shot setting in which agents continuously learn new classes from small amounts of information.
no code implementations • NeurIPS 2021 • Archit Sharma, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
Reinforcement learning (RL) promises to enable autonomous acquisition of complex behaviors for diverse agents.
1 code implementation • 23 Jul 2021 • Mike Wu, Noah Goodman, Chris Piech, Chelsea Finn
High-quality computer science education is limited by the difficulty of providing instructor feedback to students at scale.
no code implementations • NeurIPS 2021 • Guodong Zhang, Kyle Hsu, Jianing Li, Chelsea Finn, Roger Grosse
To this end, we propose Differentiable AIS (DAIS), a variant of AIS which ensures differentiability by abandoning the Metropolis-Hastings corrections.
1 code implementation • 19 Jul 2021 • Evan Zheran Liu, Behzad Haghgoo, Annie S. Chen, aditi raghunathan, Pang Wei Koh, Shiori Sagawa, Percy Liang, Chelsea Finn
Standard training via empirical risk minimization (ERM) can produce models that achieve high accuracy on average but low accuracy on certain groups, especially in the presence of spurious correlations between the input and label.
Ranked #1 on Out-of-Distribution Generalization on ImageNet-W
no code implementations • NeurIPS 2021 • Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn
We consider a setting where an agent is provided a fixed dataset of visual demonstrations illustrating how to perform a task, and must learn to solve the task using the provided demonstrations and unsupervised environment interactions.
1 code implementation • 24 Jun 2021 • Mohammad Babaeizadeh, Mohammad Taghi Saffar, Suraj Nair, Sergey Levine, Chelsea Finn, Dumitru Erhan
There is a growing body of evidence that underfitting on the training data is one of the primary causes for the low quality predictions.
Ranked #6 on Video Generation on BAIR Robot Pushing
no code implementations • ICML Workshop URL 2021 • Nicholas Rhinehart, Jenny Wang, Glen Berseth, John D Co-Reyes, Danijar Hafner, Chelsea Finn, Sergey Levine
We study this question in dynamic partially-observed environments, and argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
1 code implementation • ICLR 2022 • Huaxiu Yao, Linjun Zhang, Chelsea Finn
Meta-learning enables algorithms to quickly learn a newly encountered task with just a few labeled examples by transferring previously learned knowledge.
no code implementations • 16 Apr 2021 • Dmitry Kalashnikov, Jacob Varley, Yevgen Chebotar, Benjamin Swanson, Rico Jonschkowski, Chelsea Finn, Sergey Levine, Karol Hausman
In this paper, we study how a large-scale collective robotic learning system can acquire a repertoire of behaviors simultaneously, sharing exploration, experience, and representations across tasks.
no code implementations • 15 Apr 2021 • Yevgen Chebotar, Karol Hausman, Yao Lu, Ted Xiao, Dmitry Kalashnikov, Jake Varley, Alex Irpan, Benjamin Eysenbach, Ryan Julian, Chelsea Finn, Sergey Levine
We consider the problem of learning useful robotic skills from previously collected offline data without access to manually specified rewards or additional online exploration, a setting that is becoming increasingly important for scaling robot learning by reusing past robotic data.
no code implementations • 31 Mar 2021 • Annie S. Chen, Suraj Nair, Chelsea Finn
We find that by leveraging diverse human datasets, this reward function (a) can generalize zero shot to unseen environments, (b) generalize zero shot to unseen tasks, and (c) can be combined with visual model predictive control to solve robotic manipulation tasks on a real WidowX200 robot in an unseen environment from a single human demo.
no code implementations • 24 Mar 2021 • Behzad Haghgoo, Allan Zhou, Archit Sharma, Chelsea Finn
By planning through a learned dynamics model, model-based reinforcement learning (MBRL) offers the prospect of good performance with little environment interaction.
Model-based Reinforcement Learning reinforcement-learning +1
no code implementations • ICLR Workshop Learning_to_Learn 2021 • Ali Ghadirzadeh, Petra Poklukar, Xi Chen, Huaxiu Yao, Hossein Azizpour, Mårten Björkman, Chelsea Finn, Danica Kragic
Few-shot meta-learning methods aim to learn the common structure shared across a set of tasks to facilitate learning new tasks with small amounts of data.
no code implementations • ICLR Workshop SSL-RL 2021 • Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn
We consider the problem setting of imitation learning where the agent is provided a fixed dataset of demonstrations.
no code implementations • CVPR 2021 • Bohan Wu, Suraj Nair, Roberto Martin-Martin, Li Fei-Fei, Chelsea Finn
Our key insight is that greedy and modular optimization of hierarchical autoencoders can simultaneously address both the memory constraints and the optimization challenges of large-scale video prediction.
Ranked #1 on Video Prediction on Cityscapes 128x128
no code implementations • 5 Mar 2021 • Ali Ghadirzadeh, Xi Chen, Petra Poklukar, Chelsea Finn, Mårten Björkman, Danica Kragic
Our results show that the proposed method can successfully adapt a trained policy to different robotic platforms with novel physical parameters and the superiority of our meta-learning algorithm compared to state-of-the-art methods for the introduced few-shot policy adaptation problem.
4 code implementations • NeurIPS 2021 • Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn
We overcome this limitation by developing a new model-based offline RL algorithm, COMBO, that regularizes the value function on out-of-support state-action tuples generated via rollouts under the learned model.
no code implementations • 4 Feb 2021 • Julian Ibarz, Jie Tan, Chelsea Finn, Mrinal Kalakrishnan, Peter Pastor, Sergey Levine
Learning to perceive and move in the real world presents numerous challenges, some of which are easier to address than others, and some of which are often not considered in RL research that focuses only on simulated domains.
no code implementations • 1 Jan 2021 • Mohammad Babaeizadeh, Mohammad Taghi Saffar, Danijar Hafner, Dumitru Erhan, Harini Kannan, Chelsea Finn, Sergey Levine
In this paper, we study a number of design decisions for the predictive model in visual MBRL algorithms, focusing specifically on methods that use a predictive model for planning.
Model-based Reinforcement Learning reinforcement-learning +1
no code implementations • 1 Jan 2021 • Tianhe Yu, Xinyang Geng, Chelsea Finn, Sergey Levine
Few-shot meta-learning methods consider the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks.
no code implementations • 1 Jan 2021 • Chris Fifty, Ehsan Amid, Zhe Zhao, Tianhe Yu, Rohan Anil, Chelsea Finn
Multi-task learning can leverage information learned by one task to benefit the training of other tasks.
1 code implementation • ICLR 2021 • Stephen Tian, Suraj Nair, Frederik Ebert, Sudeep Dasari, Benjamin Eysenbach, Chelsea Finn, Sergey Levine
In our experiments, we find that our method can successfully learn models that perform a variety of tasks at test-time, moving objects amid distractors with a simulated robotic arm and even learning to open and close a drawer using a real-world robot.
1 code implementation • 21 Dec 2020 • Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn
In this work, we build on recent advances in model-based algorithms for offline RL, and extend them to high-dimensional visual observation spaces.
no code implementations • 14 Dec 2020 • Tianhe Yu, Xinyang Geng, Chelsea Finn, Sergey Levine
Few-shot meta-learning methods consider the problem of learning new tasks from a small, fixed number of examples, by meta-learning across static data from a set of previous tasks.
6 code implementations • 14 Dec 2020 • Pang Wei Koh, Shiori Sagawa, Henrik Marklund, Sang Michael Xie, Marvin Zhang, Akshay Balsubramani, Weihua Hu, Michihiro Yasunaga, Richard Lanas Phillips, Irena Gao, Tony Lee, Etienne David, Ian Stavness, Wei Guo, Berton A. Earnshaw, Imran S. Haque, Sara Beery, Jure Leskovec, Anshul Kundaje, Emma Pierson, Sergey Levine, Chelsea Finn, Percy Liang
Distribution shifts -- where the training distribution differs from the test distribution -- can substantially degrade the accuracy of machine learning (ML) systems deployed in the wild.
1 code implementation • 8 Dec 2020 • Mohammad Babaeizadeh, Mohammad Taghi Saffar, Danijar Hafner, Harini Kannan, Chelsea Finn, Sergey Levine, Dumitru Erhan
In this paper, we study a number of design decisions for the predictive model in visual MBRL algorithms, focusing specifically on methods that use a predictive model for planning.
Model-based Reinforcement Learning Reinforcement Learning (RL)
no code implementations • NeurIPS 2020 • Kelvin Xu, Siddharth Verma, Chelsea Finn, Sergey Levine
First, in real world settings, when an agent attempts a tasks and fails, the environment must somehow "reset" so that the agent can attempt the task again.
no code implementations • 12 Nov 2020 • Annie Xie, Dylan P. Losey, Ryan Tolsma, Chelsea Finn, Dorsa Sadigh
We propose a reinforcement learning-based framework for learning latent representations of an agent's policy, where the ego agent identifies the relationship between its behavior and the other agent's future strategy.
1 code implementation • 12 Nov 2020 • Karl Schmeckpeper, Oleh Rybkin, Kostas Daniilidis, Sergey Levine, Chelsea Finn
In this paper, we consider the question: can we perform reinforcement learning directly on experience collected by humans?
1 code implementation • 10 Nov 2020 • Kelvin Xu, Siddharth Verma, Chelsea Finn, Sergey Levine
Reinforcement learning has the potential to automate the acquisition of behavior in complex settings, but in order for it to be successfully deployed, a number of practical challenges must be addressed.
no code implementations • 29 Oct 2020 • Christopher Fifty, Ehsan Amid, Zhe Zhao, Tianhe Yu, Rohan Anil, Chelsea Finn
Multi-task learning can leverage information learned by one task to benefit the training of other tasks.
2 code implementations • 29 Oct 2020 • Brijen Thananjeyan, Ashwin Balakrishna, Suraj Nair, Michael Luo, Krishnan Srinivasan, Minho Hwang, Joseph E. Gonzalez, Julian Ibarz, Chelsea Finn, Ken Goldberg
Safety remains a central obstacle preventing widespread use of RL in the real world: learning new tasks in uncertain environments requires extensive exploration, but safety requires limiting exploration.
no code implementations • NeurIPS 2020 • Saurabh Kumar, Aviral Kumar, Sergey Levine, Chelsea Finn
While reinforcement learning algorithms can learn effective policies for complex tasks, these policies are often brittle to even minor task variations, especially when variations are not explicitly provided during training.
no code implementations • 27 Oct 2020 • Krishnan Srinivasan, Benjamin Eysenbach, Sehoon Ha, Jie Tan, Chelsea Finn
Safety is an essential component for deploying reinforcement learning (RL) algorithms in real-world scenarios, and is critical during the learning process itself.
1 code implementation • 26 Oct 2020 • Tony Z. Zhao, Anusha Nagabandi, Kate Rakelly, Chelsea Finn, Sergey Levine
Meta-reinforcement learning algorithms can enable autonomous agents, such as robots, to quickly acquire new behaviors by leveraging prior experience in a set of related training tasks.
1 code implementation • 22 Oct 2020 • Annie S. Chen, HyunJi Nam, Suraj Nair, Chelsea Finn
Concretely, we propose an exploration technique, Batch Exploration with Examples (BEE), that explores relevant regions of the state-space, guided by a modest number of human provided images of important states.
no code implementations • 28 Sep 2020 • Marvin Mengxin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, Chelsea Finn
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
1 code implementation • ICML 2020 • Jesse Zhang, Brian Cheung, Chelsea Finn, Sergey Levine, Dinesh Jayaraman
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous, imperiling the RL agent, other agents, and the environment.
2 code implementations • 13 Aug 2020 • Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn
That is, in offline meta-RL, we meta-train on fixed, pre-collected data from several tasks in order to adapt to a new task with a very small amount (less than 5 trajectories) of data from the new task.
2 code implementations • 6 Aug 2020 • Evan Zheran Liu, aditi raghunathan, Percy Liang, Chelsea Finn
Learning a new task often requires both exploring to gather task-relevant information and exploiting this information to solve the task.
no code implementations • ICML 2020 • Suraj Nair, Silvio Savarese, Chelsea Finn
In this paper, we propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space, resulting in a learning objective that more closely matches the downstream task.
3 code implementations • NeurIPS 2021 • Marvin Zhang, Henrik Marklund, Nikita Dhawan, Abhishek Gupta, Sergey Levine, Chelsea Finn
A fundamental assumption of most machine learning algorithms is that the training and test data are drawn from the same underlying distribution.
2 code implementations • ICLR 2021 • Allan Zhou, Tom Knowles, Chelsea Finn
We present a method for learning and encoding equivariances into networks by learning corresponding parameter sharing patterns from data.
1 code implementation • NeurIPS 2020 • Karl Pertsch, Oleh Rybkin, Frederik Ebert, Chelsea Finn, Dinesh Jayaraman, Sergey Levine
In this work we propose a framework for visual prediction and planning that is able to overcome both of these limitations.
no code implementations • ICML Workshop LifelongML 2020 • Annie Xie, James Harrison, Chelsea Finn
As humans, our goals and our environment are persistently changing throughout our lifetime based on our experiences, actions, and internal and external drives.
no code implementations • ICML Workshop LifelongML 2020 • Evan Zheran Liu, aditi raghunathan, Percy Liang, Chelsea Finn
In principle, meta-reinforcement learning approaches can exploit this shared structure, but in practice, they fail to adapt to new environments when adaptation requires targeted exploration (e. g., exploring the cabinets to find ingredients in a new kitchen).
no code implementations • ICML Workshop LifelongML 2020 • Ryan Julian, Benjamin Swanson, Gaurav S. Sukhatme, Sergey Levine, Chelsea Finn, Karol Hausman
One of the great promises of robot learning systems is that they will be able to learn from their mistakes and continuously adapt to ever-changing environments, but most robot learning systems today are deployed as fixed policies which do not adapt after deployment.
no code implementations • 12 Jun 2020 • Russell Mendonca, Xinyang Geng, Chelsea Finn, Sergey Levine
Our method is based on a simple insight: we recognize that dynamics models can be adapted efficiently and consistently with off-policy data, more easily than policies and value functions.
6 code implementations • NeurIPS 2020 • Tianhe Yu, Garrett Thomas, Lantao Yu, Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma
We also characterize the trade-off between the gain and risk of leaving the support of the batch data.
no code implementations • ICLR 2020 • Łukasz Kaiser, Mohammad Babaeizadeh, Piotr Miłos, Błażej Osiński, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski
We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.
no code implementations • ICLR 2020 • Allan Zhou, Eric Jang, Daniel Kappler, Alex Herzog, Mohi Khansari, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Sergey Levine, Chelsea Finn
Imitation learning allows agents to learn complex behaviors from demonstrations.
no code implementations • 21 Apr 2020 • Ryan Julian, Benjamin Swanson, Gaurav S. Sukhatme, Sergey Levine, Chelsea Finn, Karol Hausman
One of the great promises of robot learning systems is that they will be able to learn from their mistakes and continuously adapt to ever-changing environments.
no code implementations • NeurIPS 2020 • Lisa Lee, Benjamin Eysenbach, Ruslan Salakhutdinov, Shixiang Shane Gu, Chelsea Finn
Reinforcement learning (RL) is a powerful framework for learning to take actions to solve tasks.
1 code implementation • 16 Mar 2020 • Akhil Padmanabha, Frederik Ebert, Stephen Tian, Roberto Calandra, Chelsea Finn, Sergey Levine
We compare with a state-of-the-art tactile sensor that is only sensitive on one side, as well as a state-of-the-art multi-directional tactile sensor, and find that OmniTact's combination of high-resolution and multi-directional sensing is crucial for reliably inserting the electrical connector and allows for higher accuracy in the state estimation task.
no code implementations • 2 Mar 2020 • Xingyou Song, Yuxiang Yang, Krzysztof Choromanski, Ken Caluwaerts, Wenbo Gao, Chelsea Finn, Jie Tan
Learning adaptable policies is crucial for robots to operate autonomously in our complex and quickly changing world.
no code implementations • 25 Feb 2020 • Avi Singh, Eric Jang, Alexander Irpan, Daniel Kappler, Murtaza Dalal, Sergey Levine, Mohi Khansari, Chelsea Finn
In this work, we target this challenge, aiming to build an imitation learning system that can continuously improve through autonomous data collection, while simultaneously avoiding the explicit use of reinforcement learning, to maintain the stability, simplicity, and scalability of supervised imitation.
9 code implementations • NeurIPS 2020 • Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
While deep learning and deep reinforcement learning (RL) systems have demonstrated impressive results in domains such as image classification, game playing, and robotic control, data efficiency remains a major challenge.
no code implementations • ECCV 2020 • Karl Schmeckpeper, Annie Xie, Oleh Rybkin, Stephen Tian, Kostas Daniilidis, Sergey Levine, Chelsea Finn
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works, and then use this learned model to plan coordinated sequences of actions to bring about desired outcomes.
2 code implementations • NeurIPS 2020 • James Harrison, Apoorva Sharma, Chelsea Finn, Marco Pavone
In this work, we enable the application of generic meta-learning algorithms to settings where this task segmentation is unavailable, such as continual online learning with a time-varying task.
1 code implementation • ICLR 2021 • Glen Berseth, Daniel Geng, Coline Devin, Nicholas Rhinehart, Chelsea Finn, Dinesh Jayaraman, Sergey Levine
Every living organism struggles against disruptive environmental forces to carve out and maintain an orderly niche.
no code implementations • NeurIPS 2019 • Allan Jabri, Kyle Hsu, Ben Eysenbach, Abhishek Gupta, Sergey Levine, Chelsea Finn
In experiments on vision-based navigation and manipulation domains, we show that the algorithm allows for unsupervised meta-learning that transfers to downstream tasks specified by hand-crafted reward functions and serves as pre-training for more efficient supervised meta-learning of test task distributions.
1 code implementation • ICLR 2020 • Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn
If this is not done, the meta-learner can ignore the task training data and learn a single model that performs all of the meta-training tasks zero-shot, but does not adapt effectively to new image classes.
1 code implementation • 28 Oct 2019 • Rishi Veerapaneni, John D. Co-Reyes, Michael Chang, Michael Janner, Chelsea Finn, Jiajun Wu, Joshua B. Tenenbaum, Sergey Levine
This paper tests the hypothesis that modeling a scene in terms of entities and their local interactions, as opposed to modeling the scene globally, provides a significant benefit in generalizing to physical tasks in a combinatorial space the learner has not encountered before.
8 code implementations • 24 Oct 2019 • Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Avnish Narayan, Hayden Shively, Adithya Bellathur, Karol Hausman, Chelsea Finn, Sergey Levine
Therefore, if the aim of these methods is to enable faster acquisition of entirely new behaviors, we must evaluate them on task distributions that are sufficiently broad to enable generalization to new behaviors.
Ranked #1 on Meta-Learning on ML10
no code implementations • 24 Oct 2019 • Sudeep Dasari, Frederik Ebert, Stephen Tian, Suraj Nair, Bernadette Bucher, Karl Schmeckpeper, Siddharth Singh, Sergey Levine, Chelsea Finn
This leads to a frequent tension in robotic learning: how can we learn generalizable robotic controllers without having to collect impractically large amounts of data for each separate experiment?
no code implementations • 25 Sep 2019 • Jesse Zhang, Brian Cheung, Chelsea Finn, Dinesh Jayaraman, Sergey Levine
We study the problem of safe adaptation: given a model trained on a variety of past experiences for some task, can this model learn to perform that task in a new situation while avoiding catastrophic failure?
no code implementations • 25 Sep 2019 • Oleh Rybkin, Karl Pertsch, Frederik Ebert, Dinesh Jayaraman, Chelsea Finn, Sergey Levine
Prior work on video generation largely focuses on prediction models that only observe frames from the beginning of the video.
no code implementations • 25 Sep 2019 • Glen Berseth, Daniel Geng, Coline Devin, Dinesh Jayaraman, Chelsea Finn, Sergey Levine
All living organisms struggle against the forces of nature to carve out niches where they can maintain relative stasis.
Unsupervised Pre-training Unsupervised Reinforcement Learning
no code implementations • 25 Sep 2019 • Tianhe Yu, Saurabh Kumar, Eric Mitchell, Abhishek Gupta, Karol Hausman, Sergey Levine, Chelsea Finn
Deep learning enables training of large and flexible function approximators from scratch at the cost of large amounts of data.
no code implementations • 25 Sep 2019 • Russell Mendonca, Xinyang Geng, Chelsea Finn, Sergey Levine
Reinforcement learning algorithms can acquire policies for complex tasks automatically, however the number of samples required to learn a diverse set of skills can be prohibitively large.
1 code implementation • NeurIPS 2019 • Lantao Yu, Tianhe Yu, Chelsea Finn, Stefano Ermon
Critically, our model can infer rewards for new, structurally-similar tasks from a single demonstration.
Ranked #1 on MuJoCo Games on Sawyer Pusher
1 code implementation • ICLR 2020 • Suraj Nair, Chelsea Finn
Video prediction models combined with planning algorithms have shown promise in enabling robots to learn to perform many vision-based tasks through only self-supervision, reaching novel goals in cluttered scenes with unseen objects.
6 code implementations • NeurIPS 2019 • Aravind Rajeswaran, Chelsea Finn, Sham Kakade, Sergey Levine
By drawing upon implicit differentiation, we develop the implicit MAML algorithm, which depends only on the solution to the inner level optimization and not the path taken by the inner loop optimizer.
no code implementations • 24 Jun 2019 • Mark Woodward, Chelsea Finn, Karol Hausman
In this paper, we investigate if, and how, a "helper" agent can be trained to interactively adapt their behavior to maximize the reward of another agent, whom we call the "prime" agent, without observing their reward or receiving explicit demonstrations.
no code implementations • 24 Jun 2019 • Mark Woodward, Chelsea Finn, Karol Hausman
Most importantly, we find that our approach produces an agent that is capable of learning interactively from a human user, without a set of explicit demonstrations or a reward function, and achieving significantly better performance cooperatively with a human than a human performing the task alone.
2 code implementations • NeurIPS 2019 • Yiding Jiang, Shixiang Gu, Kevin Murphy, Chelsea Finn
We find that, using our approach, agents can learn to solve to diverse, temporally-extended tasks such as object sorting and multi-object rearrangement, including from raw pixel observations.
no code implementations • 7 Jun 2019 • Allan Zhou, Eric Jang, Daniel Kappler, Alex Herzog, Mohi Khansari, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Sergey Levine, Chelsea Finn
Imitation learning allows agents to learn complex behaviors from demonstrations.
no code implementations • ICLR 2019 • Kelvin Xu, Ellis Ratner, Anca Dragan, Sergey Levine, Chelsea Finn
A significant challenge for the practical application of reinforcement learning toreal world problems is the need to specify an oracle reward function that correctly defines a task.
no code implementations • ICLR 2019 • Rosen Kralev, Russell Mendonca, Alvin Zhang, Tianhe Yu, Abhishek Gupta, Pieter Abbeel, Sergey Levine, Chelsea Finn
Meta-reinforcement learning aims to learn fast reinforcement learning (RL) procedures that can be applied to new tasks or environments.
no code implementations • ICLR 2019 • Michael Janner, Sergey Levine, William T. Freeman, Joshua B. Tenenbaum, Chelsea Finn, Jiajun Wu
Object-based factorizations provide a useful level of abstraction for interacting with the world.
3 code implementations • 16 Apr 2019 • Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine
In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task.
no code implementations • 11 Apr 2019 • Annie Xie, Frederik Ebert, Sergey Levine, Chelsea Finn
Machine learning techniques have enabled robots to learn narrow, yet complex tasks and also perform broad, yet simple skills with a wide variety of objects.
no code implementations • NeurIPS 2019 • Russell Mendonca, Abhishek Gupta, Rosen Kralev, Pieter Abbeel, Sergey Levine, Chelsea Finn
Reinforcement learning (RL) algorithms have demonstrated promising results on complex tasks, yet often require impractical numbers of samples since they learn from scratch.
7 code implementations • ICLR Workshop LLD 2019 • Kate Rakelly, Aurick Zhou, Deirdre Quillen, Chelsea Finn, Sergey Levine
In our approach, we perform online probabilistic filtering of latent task variables to infer how to solve a new task from small amounts of experience.
no code implementations • 11 Mar 2019 • Stephen Tian, Frederik Ebert, Dinesh Jayaraman, Mayur Mudigonda, Chelsea Finn, Roberto Calandra, Sergey Levine
Touch sensing is widely acknowledged to be important for dexterous robotic manipulation, but exploiting tactile sensing for continuous, non-prehensile manipulation is challenging.
1 code implementation • 4 Mar 2019 • Yuxiang Yang, Ken Caluwaerts, Atil Iscen, Jie Tan, Chelsea Finn
To this end, we introduce a method that allows for self-adaptation of learned policies: No-Reward Meta Learning (NoRML).
1 code implementation • ICLR 2020 • Manoj Kumar, Mohammad Babaeizadeh, Dumitru Erhan, Chelsea Finn, Sergey Levine, Laurent Dinh, Durk Kingma
Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions.
Ranked #15 on Video Generation on BAIR Robot Pushing
2 code implementations • 1 Mar 2019 • Lukasz Kaiser, Mohammad Babaeizadeh, Piotr Milos, Blazej Osinski, Roy H. Campbell, Konrad Czechowski, Dumitru Erhan, Chelsea Finn, Piotr Kozakowski, Sergey Levine, Afroz Mohiuddin, Ryan Sepassi, George Tucker, Henryk Michalewski
We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.
Ranked #12 on Atari Games 100k on Atari 100k
no code implementations • ICLR Workshop LLD 2019 • Chelsea Finn, Aravind Rajeswaran, Sham Kakade, Sergey Levine
Meta-learning views this problem as learning a prior over model parameters that is amenable for fast adaptation on a new task, but typically assumes the set of tasks are available together as a batch.
1 code implementation • 14 Feb 2019 • Tianhe Yu, Gleb Shevchuk, Dorsa Sadigh, Chelsea Finn
While reinforcement learning (RL) has the potential to enable robots to autonomously acquire a wide range of skills, in practice, RL usually requires manual, per-task engineering of reward functions, especially in real world settings where aspects of the environment needed to compute progress are not directly accessible.
no code implementations • 28 Dec 2018 • Michael Janner, Sergey Levine, William T. Freeman, Joshua B. Tenenbaum, Chelsea Finn, Jiajun Wu
Object-based factorizations provide a useful level of abstraction for interacting with the world.