no code implementations • 1 Apr 2024 • Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini, Adrien Gaidon, Zsolt Kira, Rares Ambrus
Given the capabilities of neural fields in densely representing a 3D scene from 2D images, we ask the question: Can we scale their self-supervised pretraining, specifically using masked autoencoders, to generate effective 3D representations from posed RGB images?
no code implementations • 20 Feb 2024 • Takuya Ikeda, Sergey Zakharov, Tianyi Ko, Muhammad Zubair Irshad, Robert Lee, Katherine Liu, Rares Ambrus, Koichi Nishiwaki
This paper addresses the challenging problem of category-level pose estimation.
no code implementations • 19 Oct 2023 • Mayank Lunayach, Sergey Zakharov, Dian Chen, Rares Ambrus, Zsolt Kira, Muhammad Zubair Irshad
In this work, we address the challenging task of 3D object recognition without the reliance on real-world 3D labeled data.
1 code implementation • ICCV 2023 • Muhammad Zubair Irshad, Sergey Zakharov, Katherine Liu, Vitor Guizilini, Thomas Kollar, Adrien Gaidon, Zsolt Kira, Rares Ambrus
NeO 360's representation allows us to learn from a large collection of unbounded 3D scenes while offering generalizability to new views and novel scenes from as few as a single image during inference.
Ranked #1 on Generalizable Novel View Synthesis on NERDS 360
1 code implementation • CVPR 2023 • Nick Heppert, Muhammad Zubair Irshad, Sergey Zakharov, Katherine Liu, Rares Andrei Ambrus, Jeannette Bohg, Abhinav Valada, Thomas Kollar
We present CARTO, a novel approach for reconstructing multiple articulated objects from a single stereo RGB observation.
2 code implementations • 27 Jul 2022 • Muhammad Zubair Irshad, Sergey Zakharov, Rares Ambrus, Thomas Kollar, Zsolt Kira, Adrien Gaidon
A novel disentangled shape and appearance database of priors is first learned to embed objects in their respective shape and appearance space.
3D Shape Reconstruction From A Single 2D Image • 6D Pose Estimation
3 code implementations • 3 Mar 2022 • Muhammad Zubair Irshad, Thomas Kollar, Michael Laskey, Kevin Stone, Zsolt Kira
This paper studies the complex task of simultaneous multi-object 3D reconstruction, 6D pose and size estimation from a single-view RGB-D observation.
Ranked #1 on 6D Pose Estimation using RGBD on CAMERA25
1 code implementation • 26 Aug 2021 • Muhammad Zubair Irshad, Niluthpol Chowdhury Mithun, Zachary Seymour, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar
This paper presents a novel approach for the Vision-and-Language Navigation (VLN) task in continuous 3D environments, which requires an autonomous agent to follow natural language instructions in unseen environments.