Long Video Retrieval (Background Removed)

6 papers with code • 1 benchmark • 1 dataset

Retrieve long videos given their full subtitles as the text query.
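
A minimal sketch of the retrieval setup this task implies: embed the subtitles of each query, embed each candidate long video, and rank candidates by cosine similarity. The embedding dimensionality and the random stand-in features below are purely illustrative; any of the joint text-video encoders listed further down could supply the real embeddings.

```python
import numpy as np

def rank_videos(subtitle_emb: np.ndarray, video_embs: np.ndarray) -> np.ndarray:
    """Return candidate-video indices sorted by cosine similarity to the subtitle query."""
    q = subtitle_emb / np.linalg.norm(subtitle_emb)
    v = video_embs / np.linalg.norm(video_embs, axis=1, keepdims=True)
    return np.argsort(-(v @ q))  # best-matching long video first

# Illustrative stand-ins: one 512-d subtitle query against 100 candidate long videos.
rng = np.random.default_rng(0)
print(rank_videos(rng.standard_normal(512), rng.standard_normal((100, 512)))[:5])
```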

Most implemented papers

HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips

antoine77340/MIL-NCE_HowTo100M ICCV 2019

In this work, we propose instead to learn such embeddings from video data with readily available natural language annotations in the form of automatically transcribed narrations.
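
The repository name points at a MIL-NCE-style objective, in which the several narrations temporally closest to a clip are treated as candidate positives against all other narrations in the batch. The sketch below is a hedged illustration of that idea rather than the paper's exact training recipe; the shapes, temperature, and random inputs are assumptions.

```python
import torch
import torch.nn.functional as F

def mil_nce_loss(video_emb, text_emb, temperature=0.07):
    """video_emb: (B, D); text_emb: (B, K, D) with K candidate narrations per clip."""
    B, K, D = text_emb.shape
    v = F.normalize(video_emb, dim=-1)
    t = F.normalize(text_emb.reshape(B * K, D), dim=-1)
    sim = (v @ t.T / temperature).reshape(B, B, K)   # (query clip, source clip, candidate)
    pos = torch.logsumexp(sim[torch.arange(B), torch.arange(B)], dim=-1)  # a clip's own narrations
    all_ = torch.logsumexp(sim.reshape(B, -1), dim=-1)                    # every narration in batch
    return (all_ - pos).mean()                       # -log(sum_pos / sum_all)

loss = mil_nce_loss(torch.randn(8, 256), torch.randn(8, 4, 256))
```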

VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding

pytorch/fairseq EMNLP 2021

We present VideoCLIP, a contrastive approach to pre-train a unified model for zero-shot video and text understanding, without using any labels on downstream tasks.
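
At its core this is a symmetric video-text contrastive objective. The sketch below shows a generic InfoNCE form; VideoCLIP itself additionally builds positives from temporally overlapped video-text pairs and mines retrieval-augmented hard negatives, which this sketch omits. Batch size, dimensionality, and temperature are illustrative.

```python
import torch
import torch.nn.functional as F

def symmetric_contrastive_loss(video_emb, text_emb, temperature=0.07):
    """video_emb, text_emb: (B, D) paired embeddings; returns a scalar loss."""
    v = F.normalize(video_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = v @ t.T / temperature            # (B, B) similarity matrix
    targets = torch.arange(v.size(0))         # matching pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets))

loss = symmetric_contrastive_loss(torch.randn(8, 256), torch.randn(8, 256))
```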

Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos

brian7685/Multimodal-Clustering-Network ICCV 2021

Multimodal self-supervised learning is receiving growing attention, as it allows not only training large networks without human supervision but also searching for and retrieving data across modalities.
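
One way to picture the clustering component: project each modality into a shared space, cluster the pooled embeddings, and check whether different modalities of the same sample fall into the same cluster, which can then act as a pseudo-label signal. The sketch below is an assumption-laden toy version (random correlated features, k-means via scikit-learn), not the network or losses used in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
video_emb = rng.standard_normal((500, 128))                      # stand-ins for projected video features
text_emb = video_emb + 0.1 * rng.standard_normal((500, 128))     # correlated text features

# Cluster the pooled embeddings from both modalities in the shared space.
kmeans = KMeans(n_clusters=16, n_init=10, random_state=0).fit(np.vstack([video_emb, text_emb]))
video_labels, text_labels = kmeans.labels_[:500], kmeans.labels_[500:]
print("cross-modal cluster agreement:", (video_labels == text_labels).mean())
```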

TempCLR: Temporal Alignment Representation with Contrastive Learning

yyuncong/tempclr 28 Dec 2022

For long videos, given a paragraph whose sentences describe different segments of the video, matching all sentence-clip pairs implicitly aligns the paragraph with the full video.
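
Sequence-level comparison of this kind can be made explicit with a global alignment over the sentence-clip similarity matrix. The dynamic-time-warping pass below is a hedged sketch of that idea, with random embeddings and the plain DTW recursion standing in for the paper's actual alignment procedure.

```python
import numpy as np

def dtw_align(sim: np.ndarray) -> float:
    """sim[i, j] = similarity of sentence i to clip j; returns a global alignment score."""
    n, m = sim.shape
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # step options: match, skip a clip, skip a sentence
            cost[i, j] = -sim[i - 1, j - 1] + min(cost[i - 1, j - 1],
                                                  cost[i - 1, j],
                                                  cost[i, j - 1])
    return -cost[n, m]  # higher = better paragraph-to-video alignment

rng = np.random.default_rng(0)
sentences, clips = rng.standard_normal((5, 64)), rng.standard_normal((12, 64))
print(dtw_align(sentences @ clips.T))
```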

Multi-granularity Correspondence Learning from Long-term Noisy Videos

XLearning-SCU/2024-ICLR-Norton 30 Jan 2024

Existing video-language studies mainly focus on learning short video clips, leaving long-term temporal dependencies rarely explored due to the high computational cost of modeling long videos.
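
A common way to realign temporally misaligned clips and captions at the sequence level is a soft correspondence computed with entropic optimal transport. The Sinkhorn sketch below illustrates that mechanism under assumed uniform marginals and an illustrative epsilon; it is not the paper's exact formulation.

```python
import numpy as np

def sinkhorn(sim: np.ndarray, eps: float = 0.1, n_iter: int = 50) -> np.ndarray:
    """Return a transport plan over clip-caption pairs with uniform row/column marginals."""
    K = np.exp(sim / eps)                            # Gibbs kernel from similarities (cost = -sim)
    a = np.full(sim.shape[0], 1.0 / sim.shape[0])    # uniform marginal over clips
    b = np.full(sim.shape[1], 1.0 / sim.shape[1])    # uniform marginal over captions
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]               # soft clip-caption assignment

rng = np.random.default_rng(0)
plan = sinkhorn(rng.standard_normal((8, 10)))
print(plan.sum(axis=1))  # rows sum to ~1/8 each, the uniform clip marginal
```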