no code implementations • 1 Apr 2024 • ShiZhe Chen, Ricardo Garcia, Ivan Laptev, Cordelia Schmid
SUGAR employs a versatile transformer-based model to jointly address five pre-training tasks: cross-modal knowledge distillation for semantic learning, masked point modeling to understand geometric structures, grasping pose synthesis for object affordance, and 3D instance segmentation and referring expression grounding to analyze cluttered scenes.
1 code implementation • 27 Sep 2023 • ShiZhe Chen, Ricardo Garcia, Cordelia Schmid, Ivan Laptev
The ability of robots to comprehend and execute manipulation tasks based on natural language instructions is a long-term goal in robotics.
Ranked #5 on Robot Manipulation on RLBench
no code implementations • 28 Jul 2023 • Ricardo Garcia, Robin Strudel, ShiZhe Chen, Etienne Arlaud, Ivan Laptev, Cordelia Schmid
While previous work mainly evaluates domain randomization (DR) for disembodied tasks, such as pose estimation and object detection, here we systematically explore visual DR methods and benchmark them on a rich set of challenging robotic manipulation tasks.
2 code implementations • 11 Sep 2022 • Pierre-Louis Guhur, ShiZhe Chen, Ricardo Garcia, Makarand Tapaswi, Ivan Laptev, Cordelia Schmid
In human environments, robots are expected to accomplish a variety of manipulation tasks given simple natural language instructions.
Ranked #2 on Robot Manipulation on RLBench (Succ. Rate (10 tasks, 100 demos/task) metric)
7 code implementations • ICCV 2021 • Robin Strudel, Ricardo Garcia, Ivan Laptev, Cordelia Schmid
In this paper we introduce Segmenter, a transformer model for semantic segmentation.
Ranked #15 on Semantic Segmentation on PASCAL Context
1 code implementation • 25 Aug 2020 • Robin Strudel, Ricardo Garcia, Justin Carpentier, Jean-Paul Laumond, Ivan Laptev, Cordelia Schmid
Motion planning and obstacle avoidance are key challenges in robotics applications.
Robotics