no code implementations • 2 Nov 2023 • Wentao Yuan, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox
With the advent of large language models and large-scale robotic datasets, there has been tremendous progress in high-level decision-making for object manipulation.
no code implementations • 18 Apr 2023 • Adithyavairavan Murali, Arsalan Mousavian, Clemens Eppner, Adam Fishman, Dieter Fox
CabiNet is a collision model that accepts object and scene point clouds, captured from a single-view depth observation, and predicts collisions for SE(3) object poses in the scene.
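The query interface such a collision model exposes can be illustrated with a toy stand-in: the function names, the distance-threshold "predictor", and the point-cloud shapes below are illustrative assumptions, not CabiNet's actual learned network.

```python
import numpy as np

def transform_points(points, pose):
    """Apply an SE(3) pose (4x4 homogeneous matrix) to an (N, 3) point cloud."""
    homog = np.hstack([points, np.ones((points.shape[0], 1))])
    return (pose @ homog.T).T[:, :3]

def collision_query(object_points, scene_points, pose, threshold=0.01):
    """Toy stand-in for a learned collision model: declare a collision when
    any transformed object point comes within `threshold` of a scene point.
    The learned model replaces this geometric check with a network forward pass."""
    placed = transform_points(object_points, pose)
    # Pairwise distances between placed object points and scene points.
    dists = np.linalg.norm(placed[:, None, :] - scene_points[None, :, :], axis=-1)
    return bool(dists.min() < threshold)
```

A batch of candidate SE(3) placements can then be filtered by calling `collision_query` once per pose; the learned version amortizes this over thousands of queries per second.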
no code implementations • 13 Dec 2022 • Yann Labbé, Lucas Manuelli, Arsalan Mousavian, Stephen Tyree, Stan Birchfield, Jonathan Tremblay, Justin Carpentier, Mathieu Aubry, Dieter Fox, Josef Sivic
Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
no code implementations • 28 Sep 2022 • Zoey Qiuyu Chen, Karl Van Wyk, Yu-Wei Chao, Wei Yang, Arsalan Mousavian, Abhishek Gupta, Dieter Fox
The policy learned from our dataset can generalize well to unseen object poses in both simulation and the real world.

no code implementations • 22 Sep 2022 • Ishika Singh, Valts Blukis, Arsalan Mousavian, Ankit Goyal, Danfei Xu, Jonathan Tremblay, Dieter Fox, Jesse Thomason, Animesh Garg
To ameliorate that effort, large language models (LLMs) can be used to score potential next actions during task planning, and even generate action sequences directly, given an instruction in natural language with no additional domain information.
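The action-scoring idea can be sketched as a greedy loop: at each step every candidate next action is scored and filtered by a feasibility check, until a terminal action wins. The `llm_score` and `is_feasible` callables below are hypothetical stand-ins; a real system would query language-model log-probabilities and a robot affordance model.

```python
def plan(instruction, candidate_actions, llm_score, is_feasible, max_steps=5):
    """Greedy LLM-scored planning sketch: repeatedly append the feasible
    action the scorer ranks highest, stopping once 'done' scores best."""
    plan_so_far = []
    for _ in range(max_steps):
        # Score every admissible candidate, plus an explicit terminal action.
        scored = [(llm_score(instruction, plan_so_far, a), a)
                  for a in candidate_actions + ["done"]
                  if a == "done" or is_feasible(a, plan_so_far)]
        _, best = max(scored)
        if best == "done":
            break
        plan_so_far.append(best)
    return plan_so_far
```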
no code implementations • CVPR 2022 • Ankit Goyal, Arsalan Mousavian, Chris Paxton, Yu-Wei Chao, Brian Okorn, Jia Deng, Dieter Fox
Accurate object rearrangement from vision is a crucial problem for a wide variety of real-world robotics applications in unstructured environments.
1 code implementation • 29 Jun 2021 • Christopher Xie, Arsalan Mousavian, Yu Xiang, Dieter Fox
We postulate that a network architecture that encodes relations between objects at a high-level can be beneficial.
no code implementations • 2 Jun 2021 • Ahmed H. Qureshi, Arsalan Mousavian, Chris Paxton, Michael C. Yip, Dieter Fox
We propose NeRP (Neural Rearrangement Planning), a deep-learning-based approach for multi-step object rearrangement planning that works with never-before-seen objects, is trained on simulation data, and generalizes to the real world.
1 code implementation • CVPR 2021 • Luyang Zhu, Arsalan Mousavian, Yu Xiang, Hammad Mazhar, Jozef van Eenbergen, Shoubhik Debnath, Dieter Fox
Key to our approach is a local implicit neural representation built on ray-voxel pairs that allows our method to generalize to unseen objects and achieve fast inference speed.
1 code implementation • 25 Mar 2021 • Martin Sundermeyer, Arsalan Mousavian, Rudolph Triebel, Dieter Fox
Our novel grasp representation treats 3D points of the recorded point cloud as potential grasp contacts.
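Building a full 6-DoF grasp pose from one such contact point can be sketched as follows; the frame convention, parameter names, and the Gram-Schmidt construction are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def grasp_frame_from_contact(contact, baseline_dir, approach_dir, gripper_width=0.08):
    """Build a 6-DoF grasp pose (4x4) from one predicted contact point.
    `baseline_dir` points toward the opposite finger; `approach_dir` is the
    direction the gripper approaches from. Names are illustrative."""
    b = baseline_dir / np.linalg.norm(baseline_dir)
    # Orthogonalize the approach direction against the baseline (Gram-Schmidt).
    a = approach_dir - np.dot(approach_dir, b) * b
    a = a / np.linalg.norm(a)
    normal = np.cross(a, b)  # completes a right-handed frame (b, normal, a)
    pose = np.eye(4)
    pose[:3, 0] = b                                   # gripper closing axis
    pose[:3, 1] = normal
    pose[:3, 2] = a                                   # approach axis
    pose[:3, 3] = contact + 0.5 * gripper_width * b   # center between fingers
    return pose
```

Because the contact is an observed 3D point, only the remaining rotational and width parameters need to be regressed per point, which is what makes the representation compact.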
1 code implementation • 21 Nov 2020 • Michael Danielczuk, Arsalan Mousavian, Clemens Eppner, Dieter Fox
The learned model outperforms both traditional pipelines and learned ablations by 9.8% in accuracy on a dataset of simulated collision queries and is 75x faster than the best-performing baseline.
2 code implementations • 18 Nov 2020 • Clemens Eppner, Arsalan Mousavian, Dieter Fox
We introduce ACRONYM, a dataset for robot grasp planning based on physics simulation.
no code implementations • 17 Nov 2020 • Wei Yang, Chris Paxton, Arsalan Mousavian, Yu-Wei Chao, Maya Cakmak, Dieter Fox
We demonstrate the generalizability, usability, and robustness of our approach on a novel benchmark set of 26 diverse household objects, a user study with naive users (N=6) handing over a subset of 15 objects, and a systematic evaluation examining different ways of handing objects.
no code implementations • 17 Nov 2020 • Shohin Mukherjee, Chris Paxton, Arsalan Mousavian, Adam Fishman, Maxim Likhachev, Dieter Fox
Zero-shot execution of unseen robotic tasks is important for allowing robots to perform a wide variety of tasks in human environments, but collecting the amounts of data necessary to train end-to-end policies in the real world is often infeasible.
1 code implementation • 2 Oct 2020 • Lirui Wang, Yu Xiang, Wei Yang, Arsalan Mousavian, Dieter Fox
We demonstrate that our learned policy can be integrated into a tabletop 6D grasping system and a human-robot handover system to improve the grasping performance of unseen objects.
1 code implementation • 30 Jul 2020 • Yu Xiang, Christopher Xie, Arsalan Mousavian, Dieter Fox
In this work, we propose a new method for unseen object instance segmentation by learning RGB-D feature embeddings from synthetic data.
1 code implementation • 16 Jul 2020 • Christopher Xie, Yu Xiang, Arsalan Mousavian, Dieter Fox
We also show that our method can segment unseen objects for robot grasping.
no code implementations • 11 Dec 2019 • Clemens Eppner, Arsalan Mousavian, Dieter Fox
With the increasing speed and quality of physics simulations, generating large-scale grasping data sets that feed learning algorithms is becoming more and more popular.
1 code implementation • 8 Dec 2019 • Adithyavairavan Murali, Arsalan Mousavian, Clemens Eppner, Chris Paxton, Dieter Fox
Grasping in cluttered environments is a fundamental but challenging robotic skill.
1 code implementation • CVPR 2020 • Keunhong Park, Arsalan Mousavian, Yu Xiang, Dieter Fox
We evaluate the performance of our method for unseen object pose estimation on MOPED as well as the ModelNet and LINEMOD datasets.
3 code implementations • 23 Sep 2019 • Xinke Deng, Yu Xiang, Arsalan Mousavian, Clemens Eppner, Timothy Bretl, Dieter Fox
In this way, our system is able to continuously collect data and improve its pose estimation modules.
no code implementations • 30 Jul 2019 • Christopher Xie, Yu Xiang, Arsalan Mousavian, Dieter Fox
We show that our method, trained on this dataset, can produce sharp and accurate masks, outperforming state-of-the-art methods on unseen object instance segmentation.
2 code implementations • ICCV 2019 • Arsalan Mousavian, Clemens Eppner, Dieter Fox
We evaluate our approach in simulation and real-world robot experiments.
1 code implementation • 22 May 2019 • Xinke Deng, Arsalan Mousavian, Yu Xiang, Fei Xia, Timothy Bretl, Dieter Fox
In this work, we formulate the 6D object pose tracking problem in the Rao-Blackwellized particle filtering framework, where the 3D rotation and the 3D translation of an object are decoupled.
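The decoupling can be illustrated with a toy one-dimensional Rao-Blackwellized filter: translation is tracked with particles, while rotation is marginalized over a discrete grid conditioned on each particle. The function names, the Gaussian-style likelihood, and the 1-D state are illustrative stand-ins, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbpf_step(particles, weights, obs, rot_grid, lik):
    """One Rao-Blackwellized update: `particles` hold translation hypotheses;
    for each one, the rotation posterior over `rot_grid` is computed exactly
    and marginalized out when re-weighting the particle."""
    # Motion model: diffuse the translation particles.
    particles = particles + rng.normal(0.0, 0.05, size=particles.shape)
    new_weights = np.empty_like(weights)
    rot_posteriors = []
    for i, t in enumerate(particles):
        # Conditional rotation likelihood for this translation hypothesis.
        rot_lik = np.array([lik(obs, t, r) for r in rot_grid])
        # Rao-Blackwellization: particle weight = marginal over rotation.
        new_weights[i] = weights[i] * rot_lik.sum()
        rot_posteriors.append(rot_lik / rot_lik.sum())
    new_weights /= new_weights.sum()
    return particles, new_weights, rot_posteriors
```

The payoff is that a modest number of translation particles suffices even though the joint 6-D pose space is large, since the rotation dimension is handled analytically per particle.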
3 code implementations • 15 May 2018 • Arsalan Mousavian, Alexander Toshev, Marek Fiser, Jana Kosecka, Ayzaan Wahid, James Davidson
We propose using high-level semantic and contextual features, including segmentation and detection masks obtained from off-the-shelf state-of-the-art vision systems, as observations, and use a deep network to learn the navigation policy.
no code implementations • 25 Feb 2017 • Georgios Georgakis, Arsalan Mousavian, Alexander C. Berg, Jana Kosecka
In this work we explore using synthetically generated composite images for training state-of-the-art object detectors, especially for object instance detection.
11 code implementations • CVPR 2017 • Arsalan Mousavian, Dragomir Anguelov, John Flynn, Jana Kosecka
In contrast to current techniques that only regress the 3D orientation of an object, our method first regresses relatively stable 3D object properties using a deep convolutional neural network and then combines these estimates with geometric constraints provided by a 2D object bounding box to produce a complete 3D bounding box.
Ranked #9 on Vehicle Pose Estimation on KITTI Cars Hard
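The geometric constraint (the 3D box must project tightly inside the 2D detection) can be sketched as a brute-force depth search along the ray through the box center; the corner layout, search grid, and function names below are illustrative assumptions, not the paper's closed-form solver.

```python
import numpy as np

def box_corners(dims, yaw):
    """8 corners of an (h, w, l) box centered at the origin in camera
    coordinates (x right, y down, z forward), rotated by `yaw` about y."""
    h, w, l = dims
    x = np.array([1,  1, -1, -1,  1,  1, -1, -1]) * l / 2
    y = np.array([1,  1,  1,  1, -1, -1, -1, -1]) * h / 2
    z = np.array([1, -1, -1,  1,  1, -1, -1,  1]) * w / 2
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    return (R @ np.vstack([x, y, z])).T

def project(points, K):
    """Pinhole projection of (N, 3) camera-frame points to pixels."""
    uv = (K @ points.T).T
    return uv[:, :2] / uv[:, 2:3]

def fit_translation(dims, yaw, box2d, K, depths=np.linspace(1.0, 50.0, 200)):
    """Recover the translation making the projected 3D box fit the 2D box
    (u_min, v_min, u_max, v_max) most tightly, by scanning depths along the
    ray through the 2D box center."""
    u0 = (box2d[0] + box2d[2]) / 2
    v0 = (box2d[1] + box2d[3]) / 2
    ray = np.linalg.inv(K) @ np.array([u0, v0, 1.0])
    best, best_err = None, np.inf
    for d in depths:
        t = ray * d
        uv = project(box_corners(dims, yaw) + t, K)
        pred = np.array([uv[:, 0].min(), uv[:, 1].min(),
                         uv[:, 0].max(), uv[:, 1].max()])
        err = np.abs(pred - np.array(box2d)).sum()
        if err < best_err:
            best, best_err = t, err
    return best
```

Given regressed dimensions and yaw plus a 2D detection, this recovers the only remaining unknown (the translation) up to the grid resolution.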
no code implementations • 26 Sep 2016 • Georgios Georgakis, Md. Alimoor Reza, Arsalan Mousavian, Phi-Hung Le, Jana Kosecka
This paper presents a new multi-view RGB-D dataset of nine kitchen scenes, each containing several objects in realistic cluttered environments including a subset of objects from the BigBird dataset.
no code implementations • 1 Sep 2016 • Arsalan Mousavian, Jana Kosecka
In this work we present an approach for geo-locating a novel view and determining camera location and orientation using a map and a sparse set of geo-tagged reference views.
no code implementations • 25 Apr 2016 • Arsalan Mousavian, Hamed Pirsiavash, Jana Kosecka
The proposed model is trained and evaluated on the NYU Depth V2 dataset, outperforming state-of-the-art methods on semantic segmentation and achieving comparable results on the task of depth estimation.
no code implementations • 20 Sep 2015 • Arsalan Mousavian, Jana Kosecka
Several recent approaches showed how the representations learned by Convolutional Neural Networks can be repurposed for novel tasks.