Video Object Detection
66 papers with code • 7 benchmarks • 10 datasets
Video object detection is the task of detecting objects in video sequences rather than in individual still images.
(Image credit: Learning Motion Priors for Efficient Video Object Detection)
Latest papers
Objects do not disappear: Video object detection by single-frame object location anticipation
2) Improved efficiency by only doing the expensive feature computations on a small subset of all frames.
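The idea of running the expensive feature computation on only a subset of frames can be illustrated with a key-frame schedule. This is a minimal sketch, not the paper's method: `detect_video`, `expensive_features`, `cheap_update`, and `key_interval` are hypothetical names, and the fixed every-k-frames schedule is an assumption made for illustration.

```python
def detect_video(frames, expensive_features, cheap_update, key_interval=4):
    """Run the costly extractor on key frames only; reuse its output elsewhere.

    expensive_features(frame) -> features          (hypothetical heavy backbone)
    cheap_update(features, frame) -> features      (hypothetical light propagation)
    """
    results = []
    cached = None
    for index, frame in enumerate(frames):
        if index % key_interval == 0:
            # Key frame: pay the full cost and refresh the cache.
            cached = expensive_features(frame)
            results.append(cached)
        else:
            # Non-key frame: derive a result from the most recent key frame.
            results.append(cheap_update(cached, frame))
    return results
```

With `key_interval=4`, the heavy extractor runs on one frame in four, so its cost drops by roughly that factor, while accuracy on the remaining frames depends on how good the cheap update is.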
Video object detection for privacy-preserving patient monitoring in intensive care
In this paper, we propose a new method for exploiting information in the temporal succession of video frames.
3D Video Object Detection with Learnable Object-Centric Global Optimization
We explore long-term temporal visual correspondence-based optimization for 3D video object detection in this work.
FAQ: Feature Aggregated Queries for Transformer-based Video Object Detectors
With Transformer-based object detectors getting a better performance on the image domain tasks, recent works began to extend those methods to video object detection.
Feature Aggregated Queries for Transformer-Based Video Object Detectors
With Transformer-based object detectors getting a better performance on the image domain tasks, recent works began to extend those methods to video object detection.
Fewer is More: Efficient Object Detection in Large Aerial Images
Current mainstream object detection methods for large aerial images usually divide large images into patches and then exhaustively detect the objects of interest on all patches, no matter whether there exist objects or not.
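The patch-based baseline described here, which the paper argues is wasteful, can be sketched as a tiling routine that enumerates every patch regardless of content. The function names, patch size, and stride are assumptions chosen for illustration and are not taken from the paper.

```python
def patch_origins(length, patch, stride):
    """Start offsets along one axis so that patches cover [0, length)."""
    if length <= patch:
        return [0]
    origins = list(range(0, length - patch + 1, stride))
    # Add a final patch flush with the border if the stride left a gap.
    if origins[-1] + patch < length:
        origins.append(length - patch)
    return origins


def tile_image(width, height, patch=1024, stride=768):
    """Return (x0, y0, x1, y1) boxes for every patch of a width x height image."""
    boxes = []
    for y in patch_origins(height, patch, stride):
        for x in patch_origins(width, patch, stride):
            boxes.append((x, y, min(x + patch, width), min(y + patch, height)))
    return boxes
```

An exhaustive pipeline would run the detector once per returned box, so the cost grows with image area even when most patches contain no objects.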
Roboflow 100: A Rich, Multi-Domain Object Detection Benchmark
The evaluation of object detection models is usually performed by optimizing a single metric, e.g., mAP, on a fixed set of datasets, e.g., Microsoft COCO and Pascal VOC.
Deep-Learning-Based Computer Vision Approach For The Segmentation Of Ball Deliveries And Tracking In Cricket
Our research tries to solve one of these problems by segmenting ball deliveries in a cricket broadcast using deep learning models, MobileNet and YOLO, thus enabling researchers to use our work as a dataset for their research.
PTSEFormer: Progressive Temporal-Spatial Enhanced TransFormer Towards Video Object Detection
The temporal information is introduced by the temporal feature aggregation model (TFAM), by conducting an attention mechanism between the context frames and the target frame (i.e., the frame to be detected).
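Attention between a target frame and its context frames can be illustrated with plain scaled dot-product attention over one feature vector per frame. This is a simplified sketch and not the TFAM module itself: it assumes a single query vector, omits learned projections, and the name `attend` is hypothetical.

```python
import math


def attend(target, contexts):
    """Aggregate context-frame features, weighted by similarity to the target.

    target:   feature vector of the frame to be detected (list of floats)
    contexts: feature vectors of the context frames (list of lists)
    Returns (aggregated_vector, attention_weights).
    """
    dim = len(target)
    # Scaled dot-product similarity between the target and each context frame.
    scores = [
        sum(t * c for t, c in zip(target, ctx)) / math.sqrt(dim)
        for ctx in contexts
    ]
    # Numerically stable softmax over the context frames.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the context features.
    aggregated = [
        sum(w * ctx[i] for w, ctx in zip(weights, contexts))
        for i in range(dim)
    ]
    return aggregated, weights
```

Context frames whose features resemble the target receive larger weights, so they contribute more to the aggregated feature.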
YOLOV: Making Still Image Object Detectors Great at Video Object Detection
On the positive side, the detection in a certain frame of a video, compared with that in a still image, can draw support from other frames.