no code implementations • 15 Mar 2023 • Garrett Thomas, Ching-An Cheng, Ricky Loynd, Felipe Vieira Frujeri, Vibhav Vineet, Mihai Jalobeanu, Andrey Kolobov
A rich representation is key to general robotic manipulation, but existing approaches to representation learning require large amounts of multimodal demonstrations.
1 code implementation • NeurIPS 2021 • Garrett Thomas, Yuping Luo, Tengyu Ma
Safe reinforcement learning is a promising path toward applying reinforcement learning algorithms to real-world problems, where suboptimal behaviors may lead to actual negative consequences.
1 code implementation • NeurIPS 2020 • Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma
When the test task distribution differs from the training task distribution, performance may degrade significantly.
6 code implementations • NeurIPS 2020 • Tianhe Yu, Garrett Thomas, Lantao Yu, Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma
We also characterize the trade-off between the gain and risk of leaving the support of the batch data.
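The trade-off above — gaining by acting outside the batch data versus the risk of trusting the model there — is often handled by penalizing the reward with a model-uncertainty estimate. A minimal sketch; the penalty form, the ensemble-disagreement uncertainty measure, and all names here are illustrative assumptions, not the paper's exact method:

```python
import numpy as np

def penalized_reward(reward, ensemble_next_states, lam=1.0):
    """Subtract lam times an uncertainty estimate from the model reward.

    ensemble_next_states: array of shape (n_models, state_dim), the
    next-state predictions of a (hypothetical) dynamics-model ensemble.
    Disagreement across models serves as a proxy for being outside the
    support of the batch data.
    """
    # Max per-dimension standard deviation across the ensemble.
    uncertainty = ensemble_next_states.std(axis=0).max()
    return reward - lam * uncertainty
```

When the ensemble agrees (in-support states), the penalty vanishes; large disagreement makes the penalized reward pessimistic, discouraging the policy from straying far from the data.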
no code implementations • 11 Jul 2019 • Nicholas C. Landolfi, Garrett Thomas, Tengyu Ma
We then adapt the dynamical model with samples from this policy in the real environment.
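The adaptation step described above — updating a dynamical model with samples gathered in the real environment — can be sketched as fitting a correction on top of a simulator's predictions. The linear residual form and the helper names below are illustrative assumptions:

```python
import numpy as np

def adapt_model(sim_predict, states, actions, next_states):
    """Fit a linear residual correction to a simulator's dynamics.

    sim_predict: callable (state, action) -> predicted next state
    states, actions, next_states: arrays of real-environment transitions.
    Returns a corrected prediction function (hypothetical adaptation form).
    """
    X = np.hstack([states, actions])
    # Residual between real outcomes and the simulator's predictions.
    residuals = next_states - np.array(
        [sim_predict(s, a) for s, a in zip(states, actions)]
    )
    # Least-squares fit: residual ~= [state, action] @ W
    W, *_ = np.linalg.lstsq(X, residuals, rcond=None)
    return lambda s, a: sim_predict(s, a) + np.concatenate([s, a]) @ W
```

The corrected model can then be used for further planning or policy optimization, closing the loop between simulated and real dynamics.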
no code implementations • 20 Mar 2018 • Garrett Thomas, Melissa Chien, Aviv Tamar, Juan Aparicio Ojea, Pieter Abbeel
We propose to leverage this prior knowledge by guiding RL along a geometric motion plan, calculated using the CAD data.
1 code implementation • 28 Sep 2016 • Aviv Tamar, Garrett Thomas, Tianhao Zhang, Sergey Levine, Pieter Abbeel
To bring the next real-world execution closer to the hindsight plan, our approach learns to re-shape the original cost function so that short-horizon planning (as is realistic during real executions) with respect to the shaped cost mimics the hindsight plan.
8 code implementations • NeurIPS 2016 • Aviv Tamar, Yi Wu, Garrett Thomas, Sergey Levine, Pieter Abbeel
We introduce the value iteration network (VIN): a fully differentiable neural network with a "planning module" embedded within.
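The embedded planning module performs value iteration, which can be expressed with convolution-like neighbor gathers and a max over actions. A toy numpy sketch (not the differentiable network itself): a deterministic 4-connected grid with clamped boundaries, both of which are illustrative assumptions:

```python
import numpy as np

def vi_module(reward, gamma=0.9, iterations=30):
    """Synchronous value iteration on a deterministic 4-connected grid.

    Each Bellman backup gathers neighbor values per action (a shift,
    analogous to a convolution) and takes a max over the action channel,
    which is the core recurrence a VIN planning module unrolls.
    """
    V = np.zeros_like(reward)
    for _ in range(iterations):
        # Neighbor values for up/down/left/right moves (edges clamped).
        Q = np.stack([
            np.vstack([V[:1], V[:-1]]),        # up
            np.vstack([V[1:], V[-1:]]),        # down
            np.hstack([V[:, :1], V[:, :-1]]),  # left
            np.hstack([V[:, 1:], V[:, -1:]]),  # right
        ])
        V = reward + gamma * Q.max(axis=0)     # backup = shift + channel-max
    return V
```

Placing a positive reward at one cell makes the value function decay with distance from it, so greedy ascent on V recovers shortest paths; in the VIN, the same recurrence is implemented with learned convolutions so it trains end-to-end.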