no code implementations • ICCV 2023 • Ari Seff, Brian Cera, Dian Chen, Mason Ng, Aurick Zhou, Nigamaa Nayakanti, Khaled S. Refaat, Rami Al-Rfou, Benjamin Sapp
Here, we represent continuous trajectories as sequences of discrete motion tokens and cast multi-agent motion prediction as a language modeling task over this domain.
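As a rough illustration of the tokenization step, here is a minimal sketch assuming a hypothetical uniform-bin quantizer over per-step displacements (the paper's actual vocabulary and binning scheme differ):

```python
import numpy as np

def tokenize_trajectory(xy, num_bins=13, max_delta=4.0):
    """Quantize per-step (dx, dy) displacements into discrete motion tokens.

    Hypothetical uniform-bin tokenizer: the paper quantizes delta actions,
    but the exact binning here is an illustrative assumption."""
    deltas = np.diff(xy, axis=0)                     # (T-1, 2) step displacements
    clipped = np.clip(deltas, -max_delta, max_delta)
    # Map each coordinate to an integer bin in [0, num_bins).
    bins = np.round((clipped + max_delta) / (2 * max_delta) * (num_bins - 1)).astype(int)
    # Fuse the (x_bin, y_bin) pair into a single vocabulary index.
    return bins[:, 0] * num_bins + bins[:, 1]        # (T-1,) token ids

xy = np.cumsum(np.random.uniform(-1, 1, size=(9, 2)), axis=0)  # toy trajectory
tokens = tokenize_trajectory(xy)
print(tokens)  # a sequence a decoder-only language model can fit autoregressively
```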
1 code implementation • 12 Jul 2022 • Nigamaa Nayakanti, Rami Al-Rfou, Aurick Zhou, Kratarth Goel, Khaled S. Refaat, Benjamin Sapp
In this paper, we present Wayformer, a family of attention-based architectures for motion forecasting that are simple and homogeneous.
Ranked #7 on Motion Forecasting on Argoverse CVPR 2020
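A toy sketch of what a simple, homogeneous early-fusion attention encoder can look like; all shapes, layer counts, and the single-trajectory head are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class TinyEarlyFusionForecaster(nn.Module):
    """Illustrative only: every modality is projected to one shared width,
    concatenated into a single token set, and processed by one homogeneous
    transformer stack. Sizes are arbitrary assumptions."""

    def __init__(self, d_model=64, num_layers=2, horizon=8):
        super().__init__()
        self.horizon = horizon
        self.agent_proj = nn.Linear(4, d_model)   # e.g. (x, y, vx, vy) history
        self.map_proj = nn.Linear(2, d_model)     # e.g. road polyline points
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, horizon * 2)

    def forward(self, agent_feats, map_feats):
        tokens = torch.cat([self.agent_proj(agent_feats),
                            self.map_proj(map_feats)], dim=1)
        fused = self.encoder(tokens)
        # Pool and decode one trajectory; the real model outputs multiple modes.
        return self.head(fused.mean(dim=1)).view(-1, self.horizon, 2)

model = TinyEarlyFusionForecaster()
print(model(torch.randn(2, 10, 4), torch.randn(2, 50, 2)).shape)  # (2, 8, 2)
```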
no code implementations • NeurIPS 2021 • Aurick Zhou, Sergey Levine
When faced with distribution shift at test time, deep neural networks often make inaccurate predictions with unreliable uncertainty estimates. While improving the robustness of neural networks is one promising approach to mitigate this issue, an appealing alternative to robustifying networks against all possible test-time shifts is to instead directly adapt them to unlabeled inputs from the particular distribution shift we encounter at test time. However, this poses a challenging question: in the standard Bayesian model for supervised learning, unlabeled inputs are conditionally independent of model parameters when the labels are unobserved, so what can unlabeled data tell us about the model parameters at test time?
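For intuition, a simplified sketch of adapting a model to unlabeled test inputs by minimizing predictive entropy; this stands in for, but is not, the paper's Bayesian derivation, which additionally regularizes toward the training-time posterior:

```python
import torch
import torch.nn.functional as F

def test_time_adapt(model, unlabeled_x, steps=10, lr=1e-4):
    """Simplified test-time adaptation sketch: make the model more confident
    on unlabeled inputs from the shifted test distribution by minimizing the
    entropy of its predictions. Hyperparameters are illustrative."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        probs = F.softmax(model(unlabeled_x), dim=-1)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(-1).mean()
        opt.zero_grad()
        entropy.backward()
        opt.step()
    return model
```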
no code implementations • 15 Jul 2021 • Kevin Li, Abhishek Gupta, Ashwin Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine
In this work, we show that an uncertainty-aware classifier can solve challenging reinforcement learning problems by both encouraging exploration and providing directed guidance toward positive outcomes.
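A hedged sketch of the idea: the paper derives its uncertainty-aware classifier from conditional normalized maximum likelihood, while this illustration substitutes a small ensemble as a simpler stand-in for calibrated success probabilities:

```python
import torch
import torch.nn as nn

class UncertaintyAwareReward(nn.Module):
    """Ensemble stand-in for an uncertainty-aware success classifier.
    Averaging member predictions pulls probabilities toward 0.5 where the
    members disagree, so unfamiliar states get an intermediate reward that
    both tempers overconfidence and gives the agent exploration signal."""

    def __init__(self, state_dim, n_members=5):
        super().__init__()
        self.members = nn.ModuleList(
            nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))
            for _ in range(n_members))

    def forward(self, state):
        # Each member predicts p(success | state); the mean is used as reward.
        probs = torch.stack([torch.sigmoid(m(state)) for m in self.members])
        return probs.mean(dim=0).squeeze(-1)  # reward in [0, 1]

reward_fn = UncertaintyAwareReward(state_dim=8)
print(reward_fn(torch.randn(4, 8)))  # per-state rewards for an RL learner
```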
no code implementations • 1 Jan 2021 • Kevin Li, Abhishek Gupta, Vitchyr H. Pong, Ashwin Reddy, Aurick Zhou, Justin Yu, Sergey Levine
In this work, we study a more tractable class of reinforcement learning problems defined by data that provides examples of successful outcome states.
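A minimal sketch of learning from example outcome states, assuming a hypothetical logistic classifier trained to separate success examples from states the policy visits, with its output reused as a reward signal:

```python
import torch
import torch.nn.functional as F

def classifier_reward_update(classifier, success_states, policy_states, opt):
    """With only example success states (no hand-designed reward), train a
    classifier to distinguish success examples from policy-visited states,
    then reuse p(success | state) as reward. The logistic loss and names
    here are illustrative assumptions, not the paper's exact objective."""
    logits_pos = classifier(success_states)
    logits_neg = classifier(policy_states)
    loss = (F.binary_cross_entropy_with_logits(logits_pos, torch.ones_like(logits_pos))
            + F.binary_cross_entropy_with_logits(logits_neg, torch.zeros_like(logits_neg)))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return torch.sigmoid(logits_neg).detach()  # per-state reward signal
```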
no code implementations • 5 Nov 2020 • Aurick Zhou, Sergey Levine
In this paper, we propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation, calibration, and out-of-distribution robustness with deep networks.
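To make the underlying idea concrete, here is a naive (non-amortized) conditional NML sketch; ACNML replaces the per-label retraining below with an approximate posterior, and every hyperparameter here is an arbitrary assumption:

```python
import copy
import torch
import torch.nn.functional as F

def cnml_predict(model, train_x, train_y, test_x, num_classes, steps=5, lr=1e-2):
    """Naive conditional NML: for each candidate label k, fine-tune a copy of
    the model on the training set plus (test_x, k), then normalize the
    resulting likelihoods of k across labels. Inputs that no parameter
    setting can fit confidently end up with near-uniform (calibrated)
    predictions. test_x is a single example of shape (1, feature_dim)."""
    scores = []
    for k in range(num_classes):
        m = copy.deepcopy(model)
        opt = torch.optim.SGD(m.parameters(), lr=lr)
        xs = torch.cat([train_x, test_x])
        ys = torch.cat([train_y, torch.tensor([k])])
        for _ in range(steps):
            opt.zero_grad()
            F.cross_entropy(m(xs), ys).backward()
            opt.step()
        with torch.no_grad():
            scores.append(F.softmax(m(test_x), dim=-1)[0, k])
    scores = torch.stack(scores)
    return scores / scores.sum()  # CNML distribution over labels
```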
16 code implementations • NeurIPS 2020 • Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine
We theoretically show that conservative Q-learning (CQL) produces a lower bound on the value of the current policy and that it can be incorporated into a policy learning procedure with theoretical improvement guarantees.
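A short sketch of the conservative regularizer in its discrete-action form (the standard Bellman error term and the continuous-action variant are omitted for brevity):

```python
import torch

def cql_regularizer(q_net, states, actions, alpha=1.0):
    """CQL-style penalty: push down a log-sum-exp over all actions while
    pushing up Q-values on dataset actions. Added to the usual Bellman
    error, this is what yields the lower bound on the policy's value.
    `q_net(states)` is assumed to return Q-values of shape (B, num_actions)."""
    q_all = q_net(states)                              # (B, num_actions)
    logsumexp = torch.logsumexp(q_all, dim=-1)         # soft maximum over actions
    q_data = q_all.gather(-1, actions.unsqueeze(-1)).squeeze(-1)
    return alpha * (logsumexp - q_data).mean()
```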
7 code implementations • ICLR Workshop LLD 2019 • Kate Rakelly, Aurick Zhou, Deirdre Quillen, Chelsea Finn, Sergey Levine
In our approach, we perform online probabilistic filtering of latent task variables to infer how to solve a new task from small amounts of experience.
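A sketch of the probabilistic filtering step, assuming a hypothetical encoder that emits one Gaussian factor per context transition; the posterior over the latent task variable is then a closed-form product of Gaussians:

```python
import torch

def infer_task_posterior(encoder, context):
    """Probabilistic task inference sketch: `encoder(context)` is assumed to
    return per-transition Gaussian parameters (mus, logvars), each of shape
    (N, z_dim). Multiplying the N factors gives the posterior over the task
    variable z, which sharpens as more experience accumulates."""
    mus, logvars = encoder(context)
    precisions = torch.exp(-logvars)                   # 1 / sigma^2 per factor
    post_var = 1.0 / precisions.sum(dim=0)
    post_mu = post_var * (precisions * mus).sum(dim=0)
    z = post_mu + post_var.sqrt() * torch.randn_like(post_mu)  # sampled task z
    return z, post_mu, post_var
```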
no code implementations • 26 Dec 2018 • Tuomas Haarnoja, Sehoon Ha, Aurick Zhou, Jie Tan, George Tucker, Sergey Levine
In this paper, we propose a sample-efficient deep RL algorithm based on maximum entropy RL that requires minimal per-task tuning and only a modest number of trials to learn neural network policies.
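One ingredient behind the minimal per-task tuning is automatic adjustment of the entropy temperature; a minimal sketch, with the target entropy and learning rate as illustrative assumptions:

```python
import torch

def temperature_loss(log_alpha, log_probs, target_entropy):
    """Automatic entropy-temperature adjustment: alpha is updated so the
    policy's entropy tracks a target (e.g. -action_dim), removing a
    per-task hyperparameter. `log_probs` are log pi(a|s) for actions
    sampled from the current policy."""
    alpha = log_alpha.exp()
    return -(alpha * (log_probs + target_entropy).detach()).mean()

log_alpha = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([log_alpha], lr=3e-4)
loss = temperature_loss(log_alpha, torch.randn(32), target_entropy=-6.0)
opt.zero_grad()
loss.backward()
opt.step()
```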
50 code implementations • 13 Dec 2018 • Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine
1 code implementation • 19 Mar 2018 • Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine
We show that policies learned with soft Q-learning can be composed to create new policies, and that the optimality of the resulting policy can be bounded in terms of the divergence between the composed policies.
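A tabular, discrete-action sketch of the composition idea: averaging the constituent soft Q-functions and taking a softmax policy approximates the composed policy (real settings use continuous actions and sampling networks):

```python
import numpy as np

def compose_soft_policies(q_tables, temperature=1.0):
    """Maximum-entropy policy composition sketch: average the soft
    Q-functions of the constituent tasks and take the softmax policy.
    The paper bounds the resulting suboptimality by the divergence
    between the composed policies; tabular Q-values are a simplification."""
    q_avg = np.mean(q_tables, axis=0)                  # (num_states, num_actions)
    logits = q_avg / temperature
    logits -= logits.max(axis=-1, keepdims=True)       # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=-1, keepdims=True)   # composed pi(a|s)

q1 = np.random.randn(4, 3)
q2 = np.random.randn(4, 3)
pi = compose_soft_policies([q1, q2])
print(pi.sum(axis=-1))  # each row sums to 1
```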
76 code implementations • ICML 2018 • Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine
Ranked #1 on Continuous Control on Lunar Lander (OpenAI Gym)