Object Detection

3706 papers with code • 91 benchmarks • 257 datasets

Object Detection is a computer vision task in which the goal is to detect and locate objects of interest in an image or video. The task involves identifying the position and boundaries of objects in an image, and classifying the objects into different categories. It forms a crucial part of visual recognition, alongside image classification and retrieval.

State-of-the-art methods fall into two main types, one-stage methods and two-stage methods:

  • One-stage methods prioritize inference speed, and example models include YOLO, SSD and RetinaNet.

  • Two-stage methods prioritize detection accuracy, and example models include Faster R-CNN, Mask R-CNN and Cascade R-CNN.

The most popular benchmark is the MS COCO dataset. Models are typically evaluated with the mean Average Precision (mAP) metric.
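As a rough illustration of how a mean Average Precision score can be computed, the sketch below matches detections to ground-truth boxes by intersection over union (IoU), builds a precision-recall curve, and averages the area under it across classes. This is a simplified sketch under stated assumptions, not the official COCO evaluation: it uses a single IoU threshold of 0.5 (COCO averages over several thresholds), treats all boxes as coming from one image, and the function names `iou`, `average_precision`, and `mean_average_precision` are hypothetical helpers introduced here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truths, iou_threshold=0.5):
    """AP for one class. detections: list of (score, box); ground_truths: list of boxes.

    Assumption: a single image and a single IoU threshold (simplified).
    """
    if not ground_truths:
        return 0.0
    # Process detections from most to least confident.
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = set()
    tp_flags = []
    for _score, box in detections:
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue  # each ground-truth box may be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_tp = best_iou >= iou_threshold
        if is_tp:
            matched.add(best_idx)
        tp_flags.append(is_tp)

    # Precision and recall after each ranked detection.
    precisions, recalls, tp = [], [], 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truths))

    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (detections, ground_truths)."""
    aps = [average_precision(d, g) for d, g in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0
```

A detection counts as a true positive only if it overlaps an unmatched ground-truth box by at least the threshold, so duplicate detections of the same object lower precision. Benchmark toolkits apply further conventions (per-image matching, multiple IoU thresholds, object-size buckets) that this sketch leaves out.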

(Image credit: Detectron)

Latest papers with no code

A Nasal Cytology Dataset for Object Detection and Deep Learning

no code yet • 21 Apr 2024

Nasal Cytology is a new and efficient clinical technique to diagnose rhinitis and allergies that is not widespread because cell counting is time-consuming; AI-aided counting could therefore be a turning point for the diffusion of this technique.

Efficient and Concise Explanations for Object Detection with Gaussian-Class Activation Mapping Explainer

no code yet • 20 Apr 2024

To address the challenges of providing quick and plausible explanations in Explainable AI (XAI) for object detection models, we introduce the Gaussian Class Activation Mapping Explainer (G-CAME).

FisheyeDetNet: Object Detection on Fisheye Surround View Camera Systems for Automated Driving

no code yet • 20 Apr 2024

To the best of our knowledge, this is the first detailed study on object detection on fisheye cameras for autonomous driving scenarios.

Language-Driven Active Learning for Diverse Open-Set 3D Object Detection

no code yet • 19 Apr 2024

In this paper, we propose VisLED, a language-driven active learning framework for diverse open-set 3D Object Detection.

ECOR: Explainable CLIP for Object Recognition

no code yet • 19 Apr 2024

However, their black-box nature and lack of explainability in predictions make them less trustworthy in critical domains.

A Point-Based Approach to Efficient LiDAR Multi-Task Perception

no code yet • 19 Apr 2024

Unlike other LiDAR-based multi-task architectures, our proposed PAttFormer does not require separate feature encoders for multiple task-specific point cloud representations, resulting in a network that is 3x smaller and 1.4x faster while achieving competitive performance on the nuScenes and KITTI benchmarks for autonomous driving perception.

ELEV-VISION-SAM: Integrated Vision Language and Foundation Model for Automated Estimation of Building Lowest Floor Elevation

no code yet • 19 Apr 2024

By evaluating various vision language models, integration methods, and text prompts, we identify the most suitable model for street view image analytics and LFE estimation tasks, thereby improving the availability of the current LFE estimation model based on image segmentation from 33% to 56% of properties.

Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition

no code yet • 18 Apr 2024

Existing methods usually adopt a two-stage pipeline, where object proposals are first detected using a pretrained detector, and then are fed to an action recognition model for extracting video features and learning the object relations for action recognition.

Feature Corrective Transfer Learning: End-to-End Solutions to Object Detection in Non-Ideal Visual Conditions

no code yet • 17 Apr 2024

Our study introduces "Feature Corrective Transfer Learning", a novel approach that leverages transfer learning and a bespoke loss function to facilitate the end-to-end detection of objects in these challenging scenarios without the need to convert non-ideal images into their RGB counterparts.

Detector Collapse: Backdooring Object Detection to Catastrophic Overload or Blindness

no code yet • 17 Apr 2024

This paper is dedicated to bridging this gap by introducing Detector Collapse (DC), a brand-new backdoor attack paradigm tailored for object detection.