Video-Adverb Retrieval

4 papers with code • 5 benchmarks • 5 datasets

The bidirectional video-adverb retrieval task aims to retrieve the adverbs that describe how an action in a video is performed and, conversely, the videos that match a given adverb.

Most implemented papers

Action Modifiers: Learning from Adverbs in Instructional Videos

hazeld/action-modifiers CVPR 2020

We present a method to learn a representation for adverbs from instructional videos using weak supervision from the accompanying narrations.

How Do You Do It? Fine-Grained Action Understanding with Pseudo-Adverbs

hazeld/pseudoadverbs CVPR 2022

We aim to understand how actions are performed and identify subtle differences, such as 'fold firmly' vs. 'fold gently'.

Learning Action Changes by Measuring Verb-Adverb Textual Relationships

dmoltisanti/air-cvpr23 CVPR 2023

The goal of this work is to understand the way actions are performed in videos.

Video-adverb retrieval with compositional adverb-action embeddings

ExplainableML/ReGaDa 26 Sep 2023

We propose a framework for video-to-adverb retrieval (and vice versa) that aligns video embeddings with their matching compositional adverb-action text embedding in a joint embedding space.
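
To make the idea of retrieval in a joint embedding space concrete, here is a minimal sketch that ranks candidates in both directions by cosine similarity. It is not taken from any of the listed repositories: the helper names (`cosine`, `rank_by_similarity`), the clip identifiers, and the hand-made 3-dimensional vectors are all hypothetical, and a real system would obtain the embeddings from trained video and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, candidates):
    """Return candidate keys sorted from most to least similar to the query."""
    return sorted(
        candidates,
        key=lambda name: cosine(query, candidates[name]),
        reverse=True,
    )


# Hypothetical adverb-action text embeddings (assumed, not learned).
text_embeddings = {
    "fold firmly": [0.9, 0.1, 0.0],
    "fold gently": [0.1, 0.9, 0.0],
    "fold quickly": [0.0, 0.2, 0.9],
}

# Hypothetical video embeddings living in the same space.
video_embeddings = {
    "clip_a": [0.8, 0.2, 0.1],
    "clip_b": [0.2, 0.8, 0.1],
}

# Video-to-adverb: rank adverb-action phrases for one clip.
adverb_ranking = rank_by_similarity(video_embeddings["clip_a"], text_embeddings)

# Adverb-to-video: rank clips for one adverb-action phrase.
video_ranking = rank_by_similarity(text_embeddings["fold gently"], video_embeddings)

print(adverb_ranking)
print(video_ranking)
```

Both directions reuse the same similarity function, which is what makes the task bidirectional: only the roles of query and candidate set are swapped.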