Video Object Detection
65 papers with code • 7 benchmarks • 10 datasets
Video object detection is the task of detecting objects in video frames rather than in still images.
(Image credit: Learning Motion Priors for Efficient Video Object Detection)
Libraries
Use these libraries to find Video Object Detection models and implementations
Datasets
Latest papers with no code
Efficient Unsupervised Video Object Segmentation Network Based on Motion Guidance
Then, the local attention mechanism in the motion guidance module extracts the semantic features of the motion representation, which are used to obtain the high-level semantic features of the appearance representation.
BoxMask: Revisiting Bounding Box Supervision for Video Object Detection
We present a new, simple yet effective approach to uplift video object detection.
Spatio-Temporal Learnable Proposals for End-to-End Video Object Detection
Second, motivated by sequence-level semantic aggregation, we incorporate the attention-guided Semantic Proposal Feature Aggregation module to enhance object feature representation before detection.
DFA: Dynamic Feature Aggregation for Efficient Video Object Detection
Video object detection is a fundamental yet challenging task in computer vision.
DAFA: Diversity-Aware Feature Aggregation for Attention-Based Video Object Detection
Our method with global and local attention stages obtains 84.5 and 85.9 mAP on ResNet-101 and ResNeXt-101, respectively, thus achieving state-of-the-art performance without requiring additional post-processing methods.
TemporalNet: Real-time 2D-3D Video Object Detection
Our TemporalNet is a plug-and-play block that can be added to a multi-scale single-image detection network without any adjustments in the network architecture.
Real-Time Robust Video Object Detection System Against Physical-World Adversarial Attacks
This work proposes Themis, a software/hardware system to defend against adversarial patches for real-time robust video object detection.
Graph Neural Network and Spatiotemporal Transformer Attention for 3D Video Object Detection from Point Clouds
In this paper, we propose to detect 3D objects by exploiting temporal information in multiple frames, i.e., the point cloud videos.
QueryProp: Object Query Propagation for High-Performance Video Object Detection
The proposed QueryProp contains two propagation strategies: 1) query propagation is performed from sparse key frames to dense non-key frames to reduce the redundant computation on non-key frames; 2) query propagation is performed from previous key frames to the current key frame to improve feature representation by temporal context modeling.
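The two propagation paths described above can be illustrated with a small scheduling sketch. This is not the QueryProp implementation: the fixed key-frame interval, the function name `run_with_key_frames`, and the `heavy_step` / `light_step` callables are all hypothetical stand-ins, and the toy steps only record which path each frame took.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple


def run_with_key_frames(
    frames: Sequence[Any],
    key_interval: int,
    heavy_step: Callable[[Any, Optional[Any]], Tuple[Any, Any]],
    light_step: Callable[[Any, Any], Any],
) -> List[Any]:
    """Process a video with sparse key frames and dense non-key frames.

    Assumption: key frames are picked at a fixed interval. The source does
    not say how key frames are selected.
    """
    if key_interval < 1:
        raise ValueError("key_interval must be at least 1")

    outputs: List[Any] = []
    key_queries: Optional[Any] = None  # state carried from the last key frame

    for index, frame in enumerate(frames):
        if index % key_interval == 0:
            # Key frame: full computation, with the previous key frame's
            # queries passed in as temporal context (strategy 2).
            key_queries, detections = heavy_step(frame, key_queries)
        else:
            # Non-key frame: reuse the latest key-frame queries instead of
            # recomputing them (strategy 1).
            detections = light_step(frame, key_queries)
        outputs.append(detections)
    return outputs


# Toy stand-ins (hypothetical) that only label the path taken per frame.
def toy_heavy_step(frame: int, previous_queries: Optional[int]) -> Tuple[int, tuple]:
    return frame, ("key", frame, previous_queries)


def toy_light_step(frame: int, key_queries: int) -> tuple:
    return ("propagated", frame, key_queries)


if __name__ == "__main__":
    for result in run_with_key_frames(list(range(5)), 2, toy_heavy_step, toy_light_step):
        print(result)
```

With an interval of 2, frames 0, 2, and 4 take the heavy path and frames 1 and 3 reuse the most recent key-frame state, which is where the saving on non-key frames comes from in this simplified picture.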
Exploring Temporally Dynamic Data Augmentation for Video Recognition
The magnitude of augmentation operations on each frame is changed by an effective mechanism, Fourier Sampling, that parameterizes diverse, smooth, and realistic temporal variations.
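One way to picture a smooth, per-frame augmentation magnitude is a sum of a few random sinusoids over the clip. The sketch below rests on that assumption alone and is not the paper's formulation: the function name `sample_magnitudes`, the number of waves, the frequency range, and the rescaling to a `[low, high]` range are all illustrative choices.

```python
import math
import random
from typing import List, Optional


def sample_magnitudes(
    num_frames: int,
    low: float,
    high: float,
    num_waves: int = 3,
    max_cycles: float = 2.0,
    seed: Optional[int] = None,
) -> List[float]:
    """Return one augmentation magnitude per frame, varying smoothly in time.

    Assumption: the curve is a sum of `num_waves` sinusoids with random
    amplitude, frequency (in cycles per clip), and phase.
    """
    rng = random.Random(seed)
    waves = [
        (rng.uniform(0.0, 1.0), rng.uniform(0.0, max_cycles), rng.uniform(0.0, 2.0 * math.pi))
        for _ in range(num_waves)
    ]

    raw = []
    for t in range(num_frames):
        position = t / max(num_frames - 1, 1)  # normalized time in [0, 1]
        raw.append(
            sum(amp * math.sin(2.0 * math.pi * freq * position + phase) for amp, freq, phase in waves)
        )

    if not raw:
        return []
    lowest, highest = min(raw), max(raw)
    span = highest - lowest
    if span == 0.0:
        # Flat curve (for example a single frame): fall back to the midpoint.
        return [(low + high) / 2.0] * num_frames
    # Rescale so the curve covers the requested range (an illustrative choice).
    return [low + (value - lowest) / span * (high - low) for value in raw]


if __name__ == "__main__":
    print([round(m, 2) for m in sample_magnitudes(8, 0.0, 1.0, seed=0)])
```

Each value would then drive the strength of one augmentation operation (for example a brightness change) on the corresponding frame, so that neighboring frames receive similar but not identical magnitudes.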