Object Detection
3725 papers with code • 91 benchmarks • 262 datasets
Object Detection is a computer vision task in which the goal is to detect and locate objects of interest in an image or video. The task involves identifying the position and boundaries of objects in an image, and classifying the objects into different categories. It forms a crucial part of visual recognition, alongside image classification and retrieval.
State-of-the-art methods fall into two main types, one-stage methods and two-stage methods:
- One-stage methods prioritize inference speed, and example models include YOLO, SSD and RetinaNet.
- Two-stage methods prioritize detection accuracy, and example models include Faster R-CNN, Mask R-CNN and Cascade R-CNN.
The most popular benchmark is the MS COCO dataset. Models are typically evaluated with the mean Average Precision (mAP) metric.
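To make the metric concrete, here is a minimal Python sketch of how Average Precision can be computed for a single class at one Intersection-over-Union (IoU) threshold, and then averaged across classes. It is a simplified illustration, not the official COCO evaluation code: the function names (`iou`, `average_precision`, `mean_average_precision`), the `(x_min, y_min, x_max, y_max)` box format, and the single-image setup are assumptions made for this example. The COCO benchmark additionally averages over several IoU thresholds (0.50 to 0.95).

```python
from typing import Dict, List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: List[Tuple[float, Box]],  # (confidence score, box)
    ground_truth: List[Box],
    iou_threshold: float = 0.5,
) -> float:
    """Area under the precision-recall curve for one class (simplified)."""
    if not ground_truth:
        return 0.0
    matched = [False] * len(ground_truth)
    hits: List[bool] = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        # True positive only if the best-overlapping ground-truth box
        # passes the threshold and has not been claimed already.
        if best_iou >= iou_threshold and not matched[best_idx]:
            matched[best_idx] = True
            hits.append(True)
        else:
            hits.append(False)

    precisions, recalls, true_positives = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += int(hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))

    # Interpolate: precision at each point becomes the maximum precision
    # found at that point or any later (higher-recall) point.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(ap_per_class: Dict[str, float]) -> float:
    """Mean of the per-class AP values."""
    if not ap_per_class:
        return 0.0
    return sum(ap_per_class.values()) / len(ap_per_class)
```

In practice, an established implementation such as the COCO evaluation tooling is preferable, since it also handles multiple images, crowd annotations and object-size breakdowns that this sketch leaves out.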
Subtasks
- 3D Object Detection
- Real-Time Object Detection
- RGB Salient Object Detection
- Few-Shot Object Detection
- Video Object Detection
- RGB-D Salient Object Detection
- Open Vocabulary Object Detection
- Object Detection In Aerial Images
- Weakly Supervised Object Detection
- Robust Object Detection
- Small Object Detection
- Medical Object Detection
- Zero-Shot Object Detection
- Open World Object Detection
- Co-Salient Object Detection
- Dense Object Detection
- Object Proposal Generation
- Video Salient Object Detection
- Camouflaged Object Segmentation
- License Plate Detection
- Head Detection
- Multiview Detection
- 3D Object Detection From Monocular Images
- One-Shot Object Detection
- Moving Object Detection
- Surgical Tool Detection
- Described Object Detection
- Body Detection
- Pupil Detection
- Object Detection In Indoor Scenes
- Class-agnostic Object Detection
- Semantic Part Detection
- Object Skeleton Detection
- Fish Detection
- Multiple Affordance Detection
- Weakly Supervised 3D Detection
Latest papers with no code
AutoGluon-Multimodal (AutoMM): Supercharging Multimodal AutoML with Foundation Models
AutoGluon-Multimodal (AutoMM) is introduced as an open-source AutoML library designed specifically for multimodal learning.
Steal Now and Attack Later: Evaluating Robustness of Object Detection against Black-box Adversarial Attacks
Latency attacks against object detection represent a variant of adversarial attacks that aim to inflate the inference time by generating additional ghost objects in a target image.
Efficient Transformer Encoders for Mask2Former-style models
The third step is to use the aforementioned derived dataset to train a gating network that predicts the number of encoder layers to be used, conditioned on the input image.
Source-free Domain Adaptation for Video Object Detection Under Adverse Image Conditions
When deploying pre-trained video object detectors in real-world scenarios, the domain gap between training and testing data caused by adverse image conditions often leads to performance degradation.
Gallbladder Cancer Detection in Ultrasound Images based on YOLO and Faster R-CNN
A fusion method that leverages the benefits of both techniques is presented in this study.
External Prompt Features Enhanced Parameter-efficient Fine-tuning for Salient Object Detection
To better harness the potential of transformers for SOD, we propose a novel parameter-efficient fine-tuning method aimed at reducing the number of training parameters while enhancing the salient object detection capability.
ContextualFusion: Context-Based Multi-Sensor Fusion for 3D Object Detection in Adverse Operating Conditions
The fusion of multimodal sensor data streams such as camera images and lidar point clouds plays an important role in the operation of autonomous vehicles (AVs).
CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection
Feature pyramids have been widely adopted in convolutional neural networks (CNNs) and transformers for tasks like medical image segmentation and object detection.
CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective
Note that constraints on intra-sample similarities and inter-sample dissimilarities can be efficiently and effectively reformulated into a contrastive learning framework with newly designed positive and negative pairs.
NeRF-DetS: Enhancing Multi-View 3D Object Detection with Sampling-adaptive Network of Continuous NeRF-based Representation
As a preliminary work, NeRF-Det unifies the tasks of novel view synthesis and 3D perception, demonstrating that perceptual tasks can benefit from novel view synthesis methods like NeRF, significantly improving the performance of indoor multi-view 3D object detection.