Search Results for author: Fadime Sener

Found 17 papers, 8 papers with code

X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization

1 code implementation · 28 Mar 2024 · Anna Kukleva, Fadime Sener, Edoardo Remelli, Bugra Tekin, Eric Sauser, Bernt Schiele, Shugao Ma

Lately, there has been growing interest in adapting vision-language models (VLMs) to image and third-person video classification due to their success in zero-shot recognition.

Video Classification · Zero-Shot Learning

DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions

no code implementations · 26 Mar 2024 · Sammy Christen, Shreyas Hampali, Fadime Sener, Edoardo Remelli, Tomas Hodan, Eric Sauser, Shugao Ma, Bugra Tekin

In the grasping stage, the model only generates hand motions, whereas in the interaction phase both hand and object poses are synthesized.

Object

Opening the Vocabulary of Egocentric Actions

1 code implementation · NeurIPS 2023 · Dibyadip Chatterjee, Fadime Sener, Shugao Ma, Angela Yao

Given a set of verbs and objects observed during training, the goal is to generalize the verbs to an open vocabulary of actions with seen and novel objects.

Ranked #1 on Open Vocabulary Action Recognition on Assembly101 (using extra training data)

Object · Open Vocabulary Action Recognition
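As a rough illustration of the open-vocabulary setting this entry describes, the sketch below enumerates the verb/object pairs a model must generalize to when training verbs are combined with novel objects. All verb and object names are invented for illustration, not taken from the paper or from Assembly101.

```python
# Hypothetical sketch of the open-vocabulary action setting: verbs and
# objects are observed during training, and at test time the same verbs
# must generalize to actions involving novel, unseen objects.
# All names below are invented for illustration.

train_verbs = {"pick", "place", "rotate"}
train_objects = {"screw", "bolt"}
novel_objects = {"gear", "washer"}

# Actions seen in training: every (verb, object) pair over seen objects.
train_actions = {(v, o) for v in train_verbs for o in train_objects}

# The open test vocabulary pairs the same verbs with seen *and* novel objects.
test_actions = {(v, o) for v in train_verbs
                for o in train_objects | novel_objects}

# The verb/novel-object combinations the model must generalize to.
novel_pairs = test_actions - train_actions
print(sorted(novel_pairs))  # 6 pairs: each training verb with each novel object
```

The point of the construction is that the verb vocabulary stays fixed while the object set opens up, so generalization hinges on recognizing familiar verbs applied to unfamiliar objects.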

Every Mistake Counts in Assembly

no code implementations · 31 Jul 2023 · Guodong Ding, Fadime Sener, Shugao Ma, Angela Yao

Our framework constructs a knowledge base with spatial and temporal beliefs based on observed mistakes.

AssemblyHands: Towards Egocentric Activity Understanding via 3D Hand Pose Estimation

no code implementations · CVPR 2023 · Takehiko Ohkawa, Kun He, Fadime Sener, Tomas Hodan, Luan Tran, Cem Keskin

To obtain high-quality 3D hand pose annotations for the egocentric images, we develop an efficient pipeline, where we use an initial set of manual annotations to train a model to automatically annotate a much larger dataset.

3D Hand Pose Estimation · Action Classification
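The annotation pipeline described above (seed with a small set of manual labels, then auto-annotate a much larger pool) can be sketched roughly as follows. The "model" here is a trivial mean predictor standing in for a real pose-estimation network, and every name is hypothetical.

```python
# Rough sketch of a bootstrapped annotation pipeline: fit a model on a
# small manually annotated seed set, then use it to pseudo-label a much
# larger unlabeled pool. The "model" is a toy mean predictor standing in
# for a real pose-estimation network; all names are hypothetical.

def fit_mean_annotator(seed_poses):
    """Toy model: always predicts the mean of the seed annotations."""
    n, dim = len(seed_poses), len(seed_poses[0])
    mean = [sum(p[d] for p in seed_poses) / n for d in range(dim)]
    return lambda image: mean  # ignores its input, unlike a real network

# Stage 1: a small set of manual pose annotations (2-D here for brevity).
manual_seed = [[0.1, 0.2], [0.3, 0.4]]

# Stage 2: train on the seed, then auto-annotate a much larger image pool.
annotator = fit_mean_annotator(manual_seed)
unlabeled_pool = ["img_%04d.jpg" % i for i in range(1000)]
auto_labels = {img: annotator(img) for img in unlabeled_pool}

print(len(auto_labels))  # 1000 pseudo-annotations grown from 2 manual ones
```

In a real pipeline the second stage would also filter or verify the automatic labels before using them as training data; this sketch only shows the seed-then-scale structure.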

Temporal Action Segmentation: An Analysis of Modern Techniques

2 code implementations · 19 Oct 2022 · Guodong Ding, Fadime Sener, Angela Yao

Temporal action segmentation (TAS) aims to densely label the frames of minutes-long videos that contain multiple action classes.

Action Segmentation · Segmentation +1
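To make the task definition concrete, here is a minimal sketch of dense frame-wise labeling and the common mean-over-frames accuracy metric. Action names and segment lengths are invented for illustration.

```python
# Minimal sketch of temporal action segmentation as dense frame labeling:
# every frame of a video receives one action class, and predictions are
# scored per frame. Action names and lengths are invented for illustration.

def segments_to_framewise(segments):
    """Expand (action, num_frames) segments into one label per frame."""
    labels = []
    for action, num_frames in segments:
        labels.extend([action] * num_frames)
    return labels

def frame_accuracy(pred, gt):
    """Fraction of frames labeled correctly (often reported as MoF)."""
    assert len(pred) == len(gt)
    return sum(p == g for p, g in zip(pred, gt)) / len(gt)

gt = segments_to_framewise([("pour", 3), ("stir", 2), ("serve", 2)])
pred = segments_to_framewise([("pour", 4), ("stir", 2), ("serve", 1)])
print(frame_accuracy(pred, gt))  # 5 of 7 frames correct, ~0.714
```

Here the prediction over-extends the first segment by one frame, which shifts the two later segments and costs two frames of accuracy; this sensitivity to boundary errors is exactly why TAS is evaluated frame-by-frame.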

Transferring Knowledge from Text to Video: Zero-Shot Anticipation for Procedural Actions

no code implementations · 6 Jun 2021 · Fadime Sener, Rishabh Saraf, Angela Yao

Can we teach a robot to recognize and make predictions for activities that it has never seen before?

Zero-Shot Learning

Transformed ROIs for Capturing Visual Transformations in Videos

no code implementations · 6 Jun 2021 · Abhinav Rai, Fadime Sener, Angela Yao

Modeling the visual changes that an action brings to a scene is critical for video understanding.

Action Recognition · Video Understanding

Unsupervised learning of action classes with continuous temporal embedding

2 code implementations · CVPR 2019 · Anna Kukleva, Hilde Kuehne, Fadime Sener, Juergen Gall

The task of temporally detecting and segmenting actions in untrimmed videos has seen increased attention recently.

Learning Style Compatibility for Furniture

no code implementations · 9 Dec 2018 · Divyansh Aggarwal, Elchin Valiyev, Fadime Sener, Angela Yao

When judging style, a key question that often arises is whether a pair of objects is compatible.

Attribute

Zero-Shot Anticipation for Instructional Activities

no code implementations · ICCV 2019 · Fadime Sener, Angela Yao

How can we teach a robot to predict what will happen next for an activity it has never seen before?

Zero-Shot Learning

Unsupervised Learning and Segmentation of Complex Activities from Video

no code implementations · CVPR 2018 · Fadime Sener, Angela Yao

This paper presents a new method for unsupervised segmentation of complex activities from video into multiple steps, or sub-activities, without any textual input.
