Video Description
26 papers with code • 0 benchmarks • 7 datasets
The goal of automatic Video Description is to tell a story about the events happening in a video. While early Video Description methods produced captions for short clips that were manually segmented to contain a single event of interest, more recently, dense video captioning has been proposed to both segment distinct events in time and describe them in a series of coherent sentences. This problem is a generalization of dense image region captioning and has many practical applications, such as generating textual summaries for the visually impaired, or detecting and describing important events in surveillance footage.
Source: Joint Event Detection and Description in Continuous Video Streams
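The two-stage structure described above (segment events in time, then describe each one) can be sketched in a few lines. This is a toy illustration only: `propose_events` and `caption_event` are hypothetical placeholders standing in for the learned temporal-proposal and captioning models a real system would use.

```python
# Toy sketch of a dense video captioning pipeline.
# propose_events and caption_event are hypothetical stand-ins for
# learned models; a real system would use trained networks here.

def propose_events(num_frames, window=8):
    """Placeholder temporal proposal: fixed-size sliding windows."""
    return [(start, min(start + window, num_frames))
            for start in range(0, num_frames, window)]

def caption_event(frames, start, end):
    """Placeholder captioner: describes a segment of frames."""
    segment = frames[start:end]
    return f"Event from frame {start} to {end} ({len(segment)} frames)"

def dense_caption(frames):
    """Segment distinct events in time, then describe each one."""
    captions = []
    for start, end in propose_events(len(frames)):
        captions.append((start, end, caption_event(frames, start, end)))
    return captions

video = list(range(20))  # stand-in for 20 decoded frames
for start, end, sentence in dense_caption(video):
    print(sentence)
```

The key design point the sketch captures is that, unlike single-clip captioning, the caption generator runs once per proposed event, producing a series of sentences rather than one.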
Benchmarks
These leaderboards are used to track progress in Video Description
Datasets
Latest papers with no code
Multi-Layer Content Interaction Through Quaternion Product For Visual Question Answering
To solve the issue for the intermediate layers, we propose an efficient Quaternion Block Network (QBN) to learn interaction not only for the last layer but also for all intermediate layers simultaneously.
Prediction and Description of Near-Future Activities in Video
Most of the existing works on human activity analysis focus on recognition or early recognition of the activity labels from complete or partial observations.
End-to-End Video Captioning
The decoder is then optimised on such static features to generate the video's description.
A Dataset for Telling the Stories of Social Media Videos
Video content on social media platforms constitutes a major part of the communication between people, as it allows everyone to share their stories.
Incorporating Background Knowledge into Video Description Generation
We develop an approach that uses video meta-data to retrieve topically related news documents for a video and extracts the events and named entities from these documents.
Attentive Sequence to Sequence Translation for Localizing Clips of Interest by Natural Language Descriptions
We validate the effectiveness of our ASST on two large-scale datasets.
Bridge Video and Text with Cascade Syntactic Structure
We present a video captioning approach that encodes features by progressively completing syntactic structure (LSTM-CSS).
Multimodal Neural Machine Translation for Low-resource Language Pairs using Synthetic Data
In this paper, we investigate the effectiveness of training a multimodal neural machine translation (MNMT) system with image features for a low-resource language pair, Hindi and English, using synthetic data.
Video Description: A Survey of Methods, Datasets and Evaluation Metrics
Video description is the automatic generation of natural language sentences that describe the contents of a given video.
Interpretable Video Captioning via Trajectory Structured Localization
Automatically describing open-domain videos with natural language is attracting increasing interest in the field of artificial intelligence.