Video Summarization

68 papers with code • 5 benchmarks • 13 datasets

Video Summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The summary is usually composed of either a set of representative video frames (a.k.a. video key-frames) or a set of video fragments (a.k.a. video key-fragments) stitched in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.

Source: Video Summarization Using Deep Neural Networks: A Survey
Image credit: iJRASET
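
The difference between a storyboard and a skim can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-frame importance scores, the fixed fragment boundaries, and the greedy budgeted selection are all hypothetical choices made for the example, and real systems may use a different selection scheme. The function names `storyboard` and `skim` are invented here and do not come from any of the listed repositories.

```python
from typing import List, Tuple


def storyboard(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames, in chronological order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def skim(
    fragments: List[Tuple[int, int]], scores: List[float], budget: int
) -> List[Tuple[int, int]]:
    """Greedily pick fragments by mean frame score until the frame budget is used.

    Each fragment is a (start, end) frame range with end exclusive.
    Greedy selection is an assumption made for this sketch.
    """

    def mean_score(frag: Tuple[int, int]) -> float:
        start, end = frag
        return sum(scores[start:end]) / (end - start)

    chosen: List[Tuple[int, int]] = []
    used = 0
    for frag in sorted(fragments, key=mean_score, reverse=True):
        length = frag[1] - frag[0]
        if used + length <= budget:
            chosen.append(frag)
            used += length
    # Stitch the selected fragments back together in chronological order.
    return sorted(chosen)


# Hypothetical importance scores for an 8-frame video split into 4 fragments.
scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.2, 0.7, 0.6]
fragments = [(0, 2), (2, 4), (4, 6), (6, 8)]

keyframes = storyboard(scores, k=3)                 # key-frames for a storyboard
key_fragments = skim(fragments, scores, budget=4)   # key-fragments for a skim
print(keyframes, key_fragments)
```

In both cases the final sort restores chronological order, which is what keeps the storyboard or skim faithful to the timeline of the original video.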

Adopting Self-Supervised Learning into Unsupervised Video Summarization through Restorative Score.

mehryar72/RS-SUM Conference 2023

We show that the reconstruction loss of the model for a video with masked frames correlates with the representativeness of the remaining frames in the video.

7
11 Sep 2023

UniVTG: Towards Unified Video-Language Temporal Grounding

showlab/univtg ICCV 2023

Most methods in this direction develop task-specific models that are trained with type-specific labels, such as moment retrieval (time interval) and highlight detection (worthiness curve), which limits their abilities to generalize to various VTG tasks and labels.

282
31 Jul 2023

EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone

facebookresearch/EgoVLPv2 ICCV 2023

Video-language pre-training (VLP) has become increasingly important due to its ability to generalize to various vision and language tasks.

74
11 Jul 2023

MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos

Jason-Qiu/MultiSum_model 7 Jun 2023

To address these challenges and provide a comprehensive dataset for this new direction, we have meticulously curated the MMSum dataset.

23
07 Jun 2023

Joint Moment Retrieval and Highlight Detection Via Natural Language Queries

skyline-9/visionary-vids 8 May 2023

Video summarization has become an increasingly important task in the field of computer vision due to the vast amount of video content available on the internet.

9
08 May 2023

Hierarchical Video-Moment Retrieval and Step-Captioning

j-min/HiREST CVPR 2023

Our hierarchical benchmark consists of video retrieval, moment retrieval, and two novel moment segmentation and step captioning tasks.

82
29 Mar 2023

SELF-VS: Self-supervised Encoding Learning For Video Summarization

BerserkerMother/Video-Summarization 28 Mar 2023

Empirical evaluations on correlation-based metrics, such as Kendall's $\tau$ and Spearman's $\rho$, demonstrate the superiority of our approach compared to existing state-of-the-art methods in assigning relative scores to the input frames.

2
28 Mar 2023
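
As a rough illustration of the correlation-based metrics mentioned in the entry above, the following sketch computes Kendall's $\tau$ and Spearman's $\rho$ between predicted and human frame-importance scores. It is not taken from the SELF-VS repository: the score values are invented, the helper names are hypothetical, and the formulas used are the simple variants that assume no tied scores (tau-a and the rank-difference form of $\rho$). Published evaluations typically rely on a statistics library that also handles ties.

```python
from itertools import combinations
from typing import List, Sequence


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def ranks(values: Sequence[float]) -> List[int]:
    """1-based ranks in ascending order; assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho for untied data: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical per-frame scores: model predictions vs. human annotations.
predicted = [0.1, 0.4, 0.35, 0.8, 0.6]
human = [0.2, 0.3, 0.5, 0.9, 0.7]

print(kendall_tau(predicted, human))
print(spearman_rho(predicted, human))
```

Both metrics compare only the relative ordering of frames, so a model is rewarded for ranking frames the way annotators do rather than for matching their absolute score values.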

VideoXum: Cross-modal Visual and Textural Summarization of Videos

jylins/videoxum 21 Mar 2023

We propose a new joint video and text summarization task.

19
21 Mar 2023

Align and Attend: Multimodal Summarization with Dual Contrastive Losses

boheumd/A2Summ CVPR 2023

The goal of multimodal summarization is to extract the most important information from different modalities to form output summaries.

56
13 Mar 2023

VideoSum: A Python Library for Surgical Video Summarization

luiscarlosgph/videosum 15 Feb 2023

It is thus unsurprising that substantial research efforts are made to develop methods that mitigate the scarcity of annotated surgical data science (SDS) data.

14
15 Feb 2023