Dense Video Captioning

8 papers with code · Computer Vision
Subtask of Video Captioning

Greatest papers with code

Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning

CVPR 2018 JaywongWang/DenseVideoCaptioning

We propose a bidirectional proposal method that effectively exploits both past and future contexts to make proposal predictions.

DENSE VIDEO CAPTIONING

End-to-End Dense Video Captioning with Masked Transformer

CVPR 2018 salesforce/densecap

To address this problem, we propose an end-to-end transformer model for dense video captioning.

DENSE VIDEO CAPTIONING

A Better Use of Audio-Visual Cues: Dense Video Captioning with Bi-modal Transformer

17 May 2020 v-iashin/BMT

We show the effectiveness of the proposed model with audio and visual modalities on the dense video captioning task, yet the module is capable of digesting any two modalities in a sequence-to-sequence task.

DENSE VIDEO CAPTIONING · TEMPORAL ACTION PROPOSAL GENERATION

Multi-modal Dense Video Captioning

17 Mar 2020 v-iashin/MDVC

We apply an automatic speech recognition (ASR) system to obtain a temporally aligned textual description of the speech (similar to subtitles) and treat it as a separate input alongside the video frames and the corresponding audio track.

DENSE VIDEO CAPTIONING

Towards Automatic Learning of Procedures from Web Instructional Videos

28 Mar 2017 LuoweiZhou/ProcNets-YouCook2

To answer this question, we introduce the problem of procedure segmentation, that is, segmenting a video procedure into category-independent procedure segments.

DENSE VIDEO CAPTIONING

Dense-Captioning Events in Videos: SYSU Submission to ActivityNet Challenge 2020

21 Jun 2020 ttengwang/dense-video-captioning-pytorch

This technical report presents a brief description of our submission to the dense video captioning task of ActivityNet Challenge 2020.

DENSE VIDEO CAPTIONING

Joint Event Detection and Description in Continuous Video Streams

28 Feb 2018 VisionLearningGroup/JEDDi-Net

In order to explicitly model temporal relationships between visual events and their captions in a single video, we also propose a two-level hierarchical captioning module that keeps track of context.

DENSE VIDEO CAPTIONING · VIDEO UNDERSTANDING

SODA: Story Oriented Dense Video Captioning Evaluation Framework

ECCV 2020 fujiso/SODA

This paper proposes a new evaluation framework, Story Oriented Dense video cAptioning evaluation framework (SODA), for measuring the performance of video story description systems.

DENSE VIDEO CAPTIONING