Video Object Detection

66 papers with code • 7 benchmarks • 10 datasets

Video object detection is the task of detecting objects from a video as opposed to images.

( Image credit: Learning Motion Priors for Efficient Video Object Detection )

Benchmarks

Add a Result

These leaderboards are used to track progress in Video Object Detection

Dataset	Best Model	Compare
ImageNet VID	DiffusionVID (Swin-B)	See all
EPIC KITCHENS-seen splits	Temporal ROI Align	See all
EPIC KITCHENS-unseen splits	Temporal ROI Align	See all
USC-GRAD-STDdb	SLTnet FPN-X101	See all
EPIC-KITCHENS-55	Ours (Faster RCNN)	See all
YT-BB		See all
Waymo Open Dataset		See all

Libraries

Use these libraries to find Video Object Detection models and implementations

guanxiongsun/vfe.pytorch

4 papers

open-mmlab/mmtracking

3 papers

3,369

lingyunwu14/STFT

2 papers

Datasets

Most implemented papers

Most implemented Social Latest No code

Sequence Level Semantics Aggregation for Video Object Detection

open-mmlab/mmtracking • • ICCV 2019

In this work, we argue that aggregating features in the full-sequence level will lead to more discriminative and robust features for video object detection.

Paper
Code

Relation Distillation Networks for Video Object Detection

Scalsol/mega.pytorch • • ICCV 2019

In this paper, we introduce a new design to capture the interactions across the objects in spatio-temporal context.

Paper
Code

Detection and Tracking Meet Drones Challenge

VisDrone/VisDrone-Dataset • 16 Jan 2020

We provide a large-scale drone captured dataset, VisDrone, which includes four tracks, i. e., (1) image object detection, (2) video object detection, (3) single object tracking, and (4) multi-object tracking.

Paper
Code

Plug & Play Convolutional Regression Tracker for Video Object Detection

YeLyuUT/VOSDetectron • • 2 Mar 2020

As the tracker reuses the features from the detector, it is a very light-weighted increment to the detection network.

Paper
Code

Memory Enhanced Global-Local Aggregation for Video Object Detection

Scalsol/mega.pytorch • • CVPR 2020

We argue that there are two important cues for humans to recognize objects in videos: the global semantic information and the local localization information.

Paper
Code

Instance-aware, Context-focused, and Memory-efficient Weakly Supervised Object Detection

NVlabs/wetectron • • CVPR 2020

Weakly supervised learning has emerged as a compelling tool for object detection by reducing the need for strong supervision during training.

Paper
Code

Spatio-temporal Prompting Network for Robust Video Feature Extraction

guanxiongsun/vfe.pytorch • • ICCV 2023

Then, these video prompts are prepended to the patch embeddings of the current frame as the updated input for video feature extraction.

Paper
Code

Seq-NMS for Video Object Detection

tmoopenn/seq-nms • 26 Feb 2016

Video object detection is challenging because objects that are easily detected in one frame may be difficult to detect in another frame within the same clip.

Paper
Code

Structure-measure: A New Way to Evaluate Foreground Maps

DengPingFan/S-measure • ICCV 2017

Our new measure simultaneously evaluates region-aware and object-aware structural similarity between a SM and a GT map.

Paper
Code

Optimizing Video Object Detection via a Scale-Time Lattice

guanfuchen/video_obj • • CVPR 2018

High-performance object detection relies on expensive convolutional networks to compute features, often leading to significant challenges in applications, e. g. those that require detecting objects from video streams in real time.

Paper
Code

Video Object Detection

Benchmarks Add a Result

Libraries

Datasets

Most implemented papers

Content

Benchmarks

Add a Result