Object Detection
3706 papers with code • 91 benchmarks • 257 datasets
Object Detection is a computer vision task in which the goal is to detect and locate objects of interest in an image or video. The task involves identifying the position and boundaries of objects in an image, and classifying the objects into different categories. It forms a crucial part of vision recognition, alongside image classification and retrieval.
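As a rough illustration of what a detector's output looks like, the sketch below represents each detection as a bounding box, a category label, and a confidence score. The `Detection` class, the `filter_detections` helper, the corner-coordinate box format, and the 0.5 score threshold are all assumptions made for this example; they are not taken from any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object: where it is, what it is, and how confident the model is.

    Hypothetical structure; the box is stored as pixel corner coordinates.
    """
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str
    score: float

    def area(self) -> float:
        # Clamp to zero so a degenerate box never yields a negative area.
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def filter_detections(detections, min_score=0.5):
    """Keep detections at or above a confidence threshold, highest score first."""
    kept = [d for d in detections if d.score >= min_score]
    return sorted(kept, key=lambda d: d.score, reverse=True)


# Made-up detections for a single image.
detections = [
    Detection(10, 20, 110, 220, "person", 0.92),
    Detection(150, 40, 300, 200, "dog", 0.35),
    Detection(50, 60, 90, 120, "bicycle", 0.71),
]

for d in filter_detections(detections):
    print(d.label, d.score, d.area())
```

Real detectors differ in box format (for example corner coordinates versus center, width and height), so the fields here would need adapting to the model in use.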
The state-of-the-art methods can be categorized into two main types: one-stage methods and two-stage methods:
- One-stage methods prioritize inference speed, and example models include YOLO, SSD and RetinaNet.
- Two-stage methods prioritize detection accuracy, and example models include Faster R-CNN, Mask R-CNN and Cascade R-CNN.
The most popular benchmark is the MSCOCO dataset. Models are typically evaluated according to a Mean Average Precision metric.
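To make the metric concrete, here is a minimal sketch of average precision (AP) for one class at a single intersection-over-union (IoU) threshold, followed by a mean over classes. It is a simplification under stated assumptions: all boxes are treated as belonging to one image, the function names and toy boxes are invented for this example, and the official COCO evaluation additionally matches per image and averages over IoU thresholds from 0.50 to 0.95.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(predictions, ground_truths, iou_threshold=0.5):
    """AP for one class. predictions: list of (box, score); ground_truths: list of boxes."""
    if not ground_truths:
        return 0.0
    ranked = sorted(predictions, key=lambda p: p[1], reverse=True)
    matched = set()  # indices of ground-truth boxes already claimed
    hits = []        # 1 for a true positive, 0 for a false positive, in rank order
    for box, _score in ranked:
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(1)
        else:
            hits.append(0)

    # Precision and recall after each ranked prediction.
    precisions, recalls, true_positives = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truths))

    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


# Toy data: class name -> (predictions, ground-truth boxes).
per_class = {
    "person": (
        [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8), ((20, 20, 30, 30), 0.7)],
        [(0, 0, 10, 10), (20, 20, 30, 30)],
    ),
    "dog": ([((0, 0, 10, 10), 0.6)], [(0, 0, 10, 10)]),
}
ap_by_class = {name: average_precision(p, g) for name, (p, g) in per_class.items()}
mean_ap = sum(ap_by_class.values()) / len(ap_by_class)
print(ap_by_class, mean_ap)
```

For reported benchmark numbers, an established evaluation toolkit should be preferred over a hand-written routine like this one.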
Subtasks
- 3D Object Detection
- Real-Time Object Detection
- RGB Salient Object Detection
- Few-Shot Object Detection
- Video Object Detection
- RGB-D Salient Object Detection
- Open Vocabulary Object Detection
- Object Detection In Aerial Images
- Weakly Supervised Object Detection
- Small Object Detection
- Robust Object Detection
- Medical Object Detection
- Zero-Shot Object Detection
- Open World Object Detection
- Co-Salient Object Detection
- Dense Object Detection
- Object Proposal Generation
- Video Salient Object Detection
- Camouflaged Object Segmentation
- License Plate Detection
- Head Detection
- Multiview Detection
- 3D Object Detection From Monocular Images
- One-Shot Object Detection
- Moving Object Detection
- Surgical Tool Detection
- Described Object Detection
- Body Detection
- Pupil Detection
- Object Detection In Indoor Scenes
- Class-agnostic Object Detection
- Semantic Part Detection
- Object Skeleton Detection
- Fish Detection
- Multiple Affordance Detection
- Weakly Supervised 3D Detection
Latest papers with no code
A Nasal Cytology Dataset for Object Detection and Deep Learning
Nasal cytology is a new and efficient clinical technique for diagnosing rhinitis and allergies, but it is not yet widespread because cell counting is time-consuming; AI-aided counting could therefore be a turning point for the diffusion of this technique.
Efficient and Concise Explanations for Object Detection with Gaussian-Class Activation Mapping Explainer
To address the challenges of providing quick and plausible explanations in Explainable AI (XAI) for object detection models, we introduce the Gaussian Class Activation Mapping Explainer (G-CAME).
FisheyeDetNet: Object Detection on Fisheye Surround View Camera Systems for Automated Driving
To the best of our knowledge, this is the first detailed study on object detection on fisheye cameras for autonomous driving scenarios.
Language-Driven Active Learning for Diverse Open-Set 3D Object Detection
In this paper, we propose VisLED, a language-driven active learning framework for diverse open-set 3D Object Detection.
ECOR: Explainable CLIP for Object Recognition
However, their black-box nature and lack of explainability in predictions make them less trustworthy in critical domains.
A Point-Based Approach to Efficient LiDAR Multi-Task Perception
Unlike other LiDAR-based multi-task architectures, our proposed PAttFormer does not require separate feature encoders for multiple task-specific point cloud representations, resulting in a network that is 3x smaller and 1.4x faster while achieving competitive performance on the nuScenes and KITTI benchmarks for autonomous driving perception.
ELEV-VISION-SAM: Integrated Vision Language and Foundation Model for Automated Estimation of Building Lowest Floor Elevation
By evaluating various vision language models, integration methods, and text prompts, we identify the most suitable model for street view image analytics and LFE estimation tasks, thereby improving the availability of the current LFE estimation model based on image segmentation from 33% to 56% of properties.
Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition
Existing methods usually adopt a two-stage pipeline, where object proposals are first detected using a pretrained detector, and then are fed to an action recognition model for extracting video features and learning the object relations for action recognition.
Feature Corrective Transfer Learning: End-to-End Solutions to Object Detection in Non-Ideal Visual Conditions
Our study introduces "Feature Corrective Transfer Learning", a novel approach that leverages transfer learning and a bespoke loss function to facilitate the end-to-end detection of objects in these challenging scenarios without the need to convert non-ideal images into their RGB counterparts.
Detector Collapse: Backdooring Object Detection to Catastrophic Overload or Blindness
This paper is dedicated to bridging this gap by introducing Detector Collapse (DC), a brand-new backdoor attack paradigm tailored for object detection.