Video Object Detection
66 papers with code • 7 benchmarks • 10 datasets
Video object detection is the task of detecting objects in video sequences rather than in individual still images.
(Image credit: Learning Motion Priors for Efficient Video Object Detection)
Latest papers with no code
Memory Maps for Video Object Detection and Tracking on UAVs
This paper introduces a novel approach to video object detection and tracking on Unmanned Aerial Vehicles (UAVs).
Bridging Images and Videos: A Simple Learning Framework for Large Vocabulary Video Object Detection
First, LVIS provides no tracking supervision, which leads to inconsistent learning of detection (with LVIS and TAO) and tracking (only with TAO).
Unifying Tracking and Image-Video Object Detection
We propose TrIVD (Tracking and Image-Video Detection), the first framework that unifies image OD, video OD, and MOT within one end-to-end model.
Efficient Unsupervised Video Object Segmentation Network Based on Motion Guidance
Then, the semantic features of the motion representation are obtained through the local attention mechanism in the motion guidance module to obtain the high-level semantic features of the appearance representation.
BoxMask: Revisiting Bounding Box Supervision for Video Object Detection
We present a new, simple yet effective approach to uplift video object detection.
Spatio-Temporal Learnable Proposals for End-to-End Video Object Detection
Second, motivated by sequence-level semantic aggregation, we incorporate the attention-guided Semantic Proposal Feature Aggregation module to enhance object feature representation before detection.
DFA: Dynamic Feature Aggregation for Efficient Video Object Detection
Video object detection is a fundamental yet challenging task in computer vision.
DAFA: Diversity-Aware Feature Aggregation for Attention-Based Video Object Detection
Our method with global and local attention stages obtains 84.5 and 85.9 mAP on ResNet-101 and ResNeXt-101, respectively, thus achieving state-of-the-art performance without requiring additional post-processing methods.
TemporalNet: Real-time 2D-3D Video Object Detection
Our TemporalNet is a plug-and-play block that can be added to a multi-scale single-image detection network without any adjustments in the network architecture.
Real-Time Robust Video Object Detection System Against Physical-World Adversarial Attacks
This work proposes Themis, a software/hardware system to defend against adversarial patches for real-time robust video object detection.