Video Summarization

68 papers with code • 5 benchmarks • 13 datasets

Video summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The summary is usually composed of either a set of representative video frames (a.k.a. video key-frames) or a set of video fragments (a.k.a. video key-fragments) stitched in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.

Source: Video Summarization Using Deep Neural Networks: A Survey
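
The storyboard/skim distinction described above can be illustrated with a minimal sketch. It assumes that some model has already produced a per-frame importance score and that the video has already been split into fragments; the function names `storyboard` and `skim`, the score values, and the greedy selection rule are all hypothetical choices made for illustration, not a method taken from the survey.

```python
from typing import List, Tuple


def storyboard(scores: List[float], k: int) -> List[int]:
    """Pick the k highest-scoring frames and return their indices in chronological order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def skim(
    fragments: List[Tuple[int, int]], scores: List[float], budget: int
) -> List[Tuple[int, int]]:
    """Greedily pick fragments (start, end) by mean frame score within a frame budget.

    Fragments are half-open frame ranges; the result is returned in
    chronological order, ready to be stitched into a shorter video.
    """

    def mean_score(frag: Tuple[int, int]) -> float:
        start, end = frag
        return sum(scores[start:end]) / (end - start)

    chosen: List[Tuple[int, int]] = []
    used = 0
    for frag in sorted(fragments, key=mean_score, reverse=True):
        length = frag[1] - frag[0]
        if used + length <= budget:  # skip fragments that would exceed the budget
            chosen.append(frag)
            used += length
    return sorted(chosen)


# Hypothetical importance scores for a six-frame video.
frame_scores = [0.1, 0.9, 0.2, 0.4, 0.6, 0.7]
key_frames = storyboard(frame_scores, k=2)
key_fragments = skim([(0, 2), (2, 4), (4, 6)], frame_scores, budget=4)
```

Here the storyboard is a list of frame indices, while the skim is a list of frame ranges; in both cases the final sort restores chronological order after ranking by importance.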

Most implemented papers

ILS-SUMM: Iterated Local Search for Unsupervised Video Summarization

YairShemer/ILS-SUMM 8 Dec 2019

We consider shot-based video summarization where the summary consists of a subset of the video shots which can be of various lengths.
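
Choosing a subset of shots of varying lengths is often framed as a budgeted selection problem. The sketch below is one such framing under stated assumptions: each shot has a hypothetical importance score and an integer duration, and the summary must fit a total duration budget. It uses a standard 0/1 knapsack dynamic program, which is a swapped-in technique for illustration; it is not the iterated local search of ILS-SUMM, and the objective here is not necessarily the one that paper optimizes.

```python
from typing import List


def select_shots(lengths: List[int], scores: List[float], budget: int) -> List[int]:
    """Return indices of shots maximizing total score within a duration budget.

    lengths: positive integer shot durations (e.g. in seconds)
    scores:  hypothetical per-shot importance scores
    budget:  maximum total duration of the summary
    """
    n = len(lengths)
    # best[i][b] = best total score using only the first i shots within duration b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        length, score = lengths[i - 1], scores[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]  # option 1: skip shot i-1
            if length <= b and best[i - 1][b - length] + score > best[i][b]:
                best[i][b] = best[i - 1][b - length] + score  # option 2: take it

    # Walk the table backwards to recover which shots were taken.
    chosen: List[int] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= lengths[i - 1]
    return sorted(chosen)  # chronological order


# Four hypothetical shots and a 7-second summary budget.
picked = select_shots(lengths=[4, 3, 2, 5], scores=[3.0, 4.0, 2.0, 6.0], budget=7)
```

The dynamic program is exact but its table grows with the budget, which is one reason heuristic searches are attractive for long videos with many shots.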

Unsupervised Video Summarization via Attention-Driven Adversarial Learning

e-apostolidis/SUM-GAN-AAE MultiMedia Modeling (MMM) 2019

Experimental evaluation on two datasets (SumMe and TVSum) documents the contribution of the attention auto-encoder to faster and more stable training of the model, resulting in a significant performance improvement with respect to the original model and demonstrating the competitiveness of the proposed SUM-GAN-AAE against the state of the art.

Query-controllable Video Summarization

Jhhuangkay/Query-controllable-Video-Summarization 7 Apr 2020

In this work, we introduce a method which takes a text-based query as input and generates a video summary corresponding to it.

Ultrasound Video Summarization using Deep Reinforcement Learning

Lorna-Liu/ultrasound_vsumm_RL 19 May 2020

We show that our method is superior to alternative video summarization methods and that it preserves essential information required by clinical diagnostic standards.

Multi-modal Summarization for Video-containing Documents

xiyan524/mm-avs 17 Sep 2020

Summarization of multimedia data becomes increasingly significant as it is the basis for many real-world applications, such as question answering, Web search, and so forth.

Spatio-Temporal Stability Analysis in Satellite Image Times Series

mchelali/TemporalStability 9 Oct 2020

Satellite Image Time Series (SITS) provide valuable information for the study of the Earth’s surface.

Siamese Tracking with Lingual Object Constraints

CMFiltenborg/lingually_constrained_tracking 23 Nov 2020

Classically, visual object tracking involves following a target object throughout a given video, and it provides us the motion trajectory of the object.

DSNet: A Flexible Detect-to-Summarize Network for Video Summarization

li-plus/DSNet 1 Dec 2020

In this paper, we propose a Detect-to-Summarize network (DSNet) framework for supervised video summarization.

Movie Summarization via Sparse Graph Construction

ppapalampidi/GraphTP 14 Dec 2020

We summarize full-length movies by creating shorter videos containing their most informative scenes.