no code implementations • 29 Apr 2024 • Gabriel Sarch, Sahil Somani, Raghav Kapoor, Michael J. Tarr, Katerina Fragkiadaki
Recent research on instructable agents has used memory-augmented Large Language Models (LLMs) as task planners: language-program examples relevant to the input instruction are retrieved from memory and included as in-context examples in the LLM prompt, improving the LLM's ability to infer correct action and task plans.
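The retrieval step described above can be sketched minimally: score stored (instruction, program) pairs against the input instruction, take the top-k, and prepend them to the prompt. This is an illustrative toy, not the paper's method — real systems typically use learned embeddings rather than the string similarity used here, and the example memory and `build_prompt` format are invented for demonstration.

```python
# Hedged sketch of memory-augmented prompting. Uses string similarity
# as a stand-in for an embedding-based retriever; all names are illustrative.
from difflib import SequenceMatcher

def retrieve_examples(instruction, memory, k=2):
    """Rank stored (instruction, program) pairs by similarity to the input."""
    return sorted(
        memory,
        key=lambda ex: SequenceMatcher(None, instruction, ex[0]).ratio(),
        reverse=True,
    )[:k]

def build_prompt(instruction, memory, k=2):
    """Prepend the k most relevant examples as in-context shots."""
    examples = retrieve_examples(instruction, memory, k)
    shots = "\n".join(f"Instruction: {i}\nProgram: {p}" for i, p in examples)
    return f"{shots}\nInstruction: {instruction}\nProgram:"

# Toy memory of language-program pairs (invented for this sketch).
memory = [
    ("put the mug in the sink", "pick('mug'); place('sink')"),
    ("open the fridge", "open('fridge')"),
    ("put the plate on the table", "pick('plate'); place('table')"),
]
prompt = build_prompt("put the cup in the sink", memory)
```

The completed prompt would then be sent to the LLM, which continues the pattern by emitting a program for the new instruction.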
1 code implementation • 4 Jan 2024 • Ayush Jain, Pushkal Katara, Nikolaos Gkanatsios, Adam W. Harley, Gabriel Sarch, Kriti Aggarwal, Vishrav Chaudhary, Katerina Fragkiadaki
The gap in performance between methods that consume posed images versus post-processed 3D point clouds has fueled the belief that 2D and 3D perception require distinct model architectures.
Ranked #1 on 3D Instance Segmentation on ScanNet200
no code implementations • 23 Oct 2023 • Gabriel Sarch, Yue Wu, Michael J. Tarr, Katerina Fragkiadaki
Pre-trained and frozen large language models (LLMs) can effectively map simple scene rearrangement instructions to programs over a robot's visuomotor functions through appropriate few-shot example prompting.
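Once the LLM has emitted such a program, it must be executed against the robot's visuomotor functions. A minimal sketch, assuming a dispatch table of primitives — the primitive names (`goto`, `pick`, `place`) and program syntax here are hypothetical, not the paper's actual API:

```python
# Hedged sketch: dispatch an LLM-generated program string to a small
# whitelist of visuomotor primitives. All names are illustrative.
log = []

def goto(obj): log.append(f"goto {obj}")    # stand-in for navigation
def pick(obj): log.append(f"pick {obj}")    # stand-in for grasping
def place(recep): log.append(f"place {recep}")  # stand-in for placement

PRIMITIVES = {"goto": goto, "pick": pick, "place": place}

def run_program(program):
    """Parse "fn('arg'); fn('arg')" steps and call whitelisted primitives."""
    for step in program.split(";"):
        step = step.strip()
        if not step:
            continue
        name, arg = step.rstrip(")").split("(", 1)
        PRIMITIVES[name](arg.strip("'\""))  # KeyError rejects unknown calls

# A program an LLM might return for "move the cup to the shelf":
run_program("goto('cup'); pick('cup'); goto('shelf'); place('shelf')")
```

Restricting execution to a whitelist of known primitives is one simple way to keep a frozen LLM's free-form output within the robot's actual capabilities.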
no code implementations • 4 Sep 2023 • Gabriel Sarch, Hsiao-Yu Fish Tung, Aria Wang, Jacob Prince, Michael Tarr
Deep neural network representations align well with brain activity in the ventral visual stream.
1 code implementation • 21 Jul 2022 • Gabriel Sarch, Zhaoyuan Fang, Adam W. Harley, Paul Schydlo, Michael J. Tarr, Saurabh Gupta, Katerina Fragkiadaki
We introduce TIDEE, an embodied agent that tidies up a disordered scene based on learned commonsense object placement and room arrangement priors.
1 code implementation • 30 Nov 2020 • Zhaoyuan Fang, Ayush Jain, Gabriel Sarch, Adam W. Harley, Katerina Fragkiadaki
Experiments on both indoor and outdoor datasets show that (1) our method obtains high-quality 2D and 3D pseudo-labels from multi-view RGB-D data; (2) fine-tuning with these pseudo-labels improves the 2D detector significantly in the test environment; (3) training a 3D detector with our pseudo-labels outperforms a prior self-supervised method by a large margin; (4) given weak supervision, our method can generate better pseudo-labels for novel objects.