Search Results for author: Dídac Surís

Found 9 papers, 6 papers with code

pix2gestalt: Amodal Segmentation by Synthesizing Wholes

1 code implementation • 25 Jan 2024 • Ege Ozguroglu, Ruoshi Liu, Dídac Surís, Dian Chen, Achal Dave, Pavel Tokmakov, Carl Vondrick

We introduce pix2gestalt, a framework for zero-shot amodal segmentation, which learns to estimate the shape and appearance of whole objects that are only partially visible behind occlusions.

3D Reconstruction Object Recognition +1

Paper
Code

ViperGPT: Visual Inference via Python Execution for Reasoning

1 code implementation • ICCV 2023 • Dídac Surís, Sachit Menon, Carl Vondrick

Answering visual queries is a complex task that requires both visual processing and reasoning.

Ranked #10 on Zero-Shot Video Question Answer on NExT-QA

Code Generation Zero-Shot Video Question Answer

1,609

Paper
Code

FLEX: Full-Body Grasping Without Full-Body Grasps

no code implementations • CVPR 2023 • Purva Tendulkar, Dídac Surís, Carl Vondrick

Towards this goal, we address the task of generating a virtual human -- hands and full body -- grasping everyday objects.

Paper
Add Code

Representing Spatial Trajectories as Distributions

no code implementations • 4 Oct 2022 • Dídac Surís, Carl Vondrick

We introduce a representation learning framework for spatial trajectories.

Representation Learning

Paper
Add Code

Learning the Predictability of the Future

1 code implementation • CVPR 2021 • Dídac Surís, Ruoshi Liu, Carl Vondrick

We introduce a framework for learning from unlabeled video what is predictable in the future.

Representation Learning Self-Supervised Action Recognition +1

156

Paper
Code

Globetrotter: Connecting Languages by Connecting Images

1 code implementation • CVPR 2022 • Dídac Surís, Dave Epstein, Carl Vondrick

Machine translation between many languages at once is highly challenging, since training with ground truth requires supervision between all language pairs, which is difficult to obtain.

Machine Translation Retrieval +2

Paper
Code

Learning to Learn Words from Visual Scenes

1 code implementation • ECCV 2020 • Dídac Surís, Dave Epstein, Heng Ji, Shih-Fu Chang, Carl Vondrick

Language acquisition is the process of learning words from the surrounding scene.

Language Acquisition Language Modelling +1

Paper
Code

Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input

no code implementations • ECCV 2018 • David Harwath, Adrià Recasens, Dídac Surís, Galen Chuang, Antonio Torralba, James Glass

In this paper, we explore neural network models that learn to associate segments of spoken audio captions with the semantically relevant portions of natural images that they refer to.

Retrieval

Paper
Add Code

Overcoming catastrophic forgetting with hard attention to the task

2 code implementations • ICML 2018 • Joan Serrà, Dídac Surís, Marius Miron, Alexandros Karatzoglou

In this paper, we propose a task-based hard attention mechanism that preserves previous tasks' information without affecting the current task's learning.

Ranked #2 on Continual Learning on 20Newsgroup (10 tasks)

Continual Learning Hard Attention

192

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.