Video Summarization

68 papers with code • 5 benchmarks • 13 datasets

Video Summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The summary is usually composed of either a set of representative video frames (a.k.a. video key-frames) or a set of video fragments (a.k.a. video key-fragments) stitched together in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.

Source: Video Summarization Using Deep Neural Networks: A Survey
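
The two summary types described above can be illustrated with a minimal sketch. It assumes that per-frame importance scores and fragment boundaries are already available; the names `storyboard`, `skim`, `scores`, `fragments`, and `budget` are hypothetical and do not come from the survey. The skim uses a simple greedy heuristic under a frame budget, which is only one of several possible selection strategies.

```python
from typing import List, Tuple

# A fragment is (start_frame, end_frame) with the end index exclusive.
Fragment = Tuple[int, int]


def storyboard(scores: List[float], k: int) -> List[int]:
    """Pick the k highest-scoring frames, returned in chronological order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def skim(scores: List[float], fragments: List[Fragment], budget: int) -> List[Fragment]:
    """Greedily pick fragments by mean score while they fit in the frame budget."""

    def mean_score(frag: Fragment) -> float:
        start, end = frag
        return sum(scores[start:end]) / (end - start)

    chosen: List[Fragment] = []
    used = 0
    for frag in sorted(fragments, key=mean_score, reverse=True):
        length = frag[1] - frag[0]
        if used + length <= budget:
            chosen.append(frag)
            used += length
    # Stitch the selected fragments back into chronological order.
    return sorted(chosen)


# Hypothetical importance scores for an 8-frame video split into 4 fragments.
scores = [0.2, 0.9, 0.8, 0.1, 0.1, 0.3, 0.6, 0.7]
fragments = [(0, 2), (2, 4), (4, 6), (6, 8)]

print(storyboard(scores, 3))       # key-frame indices
print(skim(scores, fragments, 4))  # key-fragments within a 4-frame budget
```

In this sketch the storyboard is a list of frame indices and the skim is a list of fragments; both are sorted by time so that the result preserves the original order of events.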
Image credit: iJRASET

Benchmarks

These leaderboards are used to track progress in Video Summarization.

Dataset	Best Model
SumMe	PGL-SUM
TVSum	RR-STG
Query-Focused Video Summarization Dataset	EgoVLPv2
Shot2Story20K	SUM-shot
VideoXum	VTSUM-BLIP

Latest papers with no code

Pegasus-v1 Technical Report

no code yet • 23 Apr 2024

This technical report introduces Pegasus-1, a multimodal language model specialized in video content understanding and interaction through natural language.

V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning

no code yet • 18 Apr 2024

Recent efforts have been made to expand from unimodal to multimodal video summarization, categorizing the task into three sub-tasks based on the summary's modality: video-to-video (V2V), video-to-text (V2T), and a combination of video and text summarization (V2VT).

Scaling Up Video Summarization Pretraining with Large Language Models

no code yet • 4 Apr 2024

Long-form video content constitutes a significant portion of internet traffic, making automated video summarization an essential research problem.

FastPerson: Enhancing Video Learning through Effective Video Summarization that Preserves Linguistic and Visual Contexts

no code yet • 26 Mar 2024

There is a risk of missing important information when both the teacher's speech and the visual information on the blackboard or slides matter, as in a lecture video.

Large Model based Sequential Keyframe Extraction for Video Summarization

no code yet • 10 Jan 2024

Keyframe extraction aims to sum up a video's semantics with the minimum number of its frames.

Beyond the Frame: Single and mutilple video summarization method with user-defined length

no code yet • 23 Dec 2023

A single video or multiple videos can be summarized into a relatively short video using a variety of techniques, from multimodal audio-visual methods to natural language processing approaches.

Facilitating the Production of Well-tailored Video Summaries for Sharing on Social Media

no code yet • 5 Dec 2023

This paper presents a web-based tool that facilitates the production of tailored summaries for online sharing on social media.

Video Summarization: Towards Entity-Aware Captions

no code yet • 1 Dec 2023

We also release a large-scale dataset, VIEWS (VIdeo NEWS), to support research on this task.

Scene Summarization: Clustering Scene Videos into Spatially Diverse Frames

no code yet • 28 Nov 2023

Scene summarization aims to condense a long video walkthrough of a scene into a small set of frames that are spatially diverse in the scene, which has many important applications, such as in surveillance, real estate, and robotics.

Conditional Modeling Based Automatic Video Summarization

no code yet • 20 Nov 2023

The aim of video summarization is to shorten videos automatically while retaining the key information necessary to convey the overall story.
