Object Detection

3645 papers with code • 84 benchmarks • 251 datasets

Object Detection is a computer vision task in which the goal is to detect and locate objects of interest in an image or video. The task involves identifying the position and boundaries of objects in an image, and classifying the objects into different categories. It forms a crucial part of visual recognition, alongside image classification and retrieval.

State-of-the-art methods fall into two main types, one-stage and two-stage:

  • One-stage methods prioritize inference speed, and example models include YOLO, SSD and RetinaNet.

  • Two-stage methods prioritize detection accuracy, and example models include Faster R-CNN, Mask R-CNN and Cascade R-CNN.

The most popular benchmark is the MS COCO dataset. Models are typically evaluated with the mean Average Precision (mAP) metric.
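
To make the metric concrete, here is a minimal sketch of Average Precision for one class at a single IoU threshold, with mAP taken as the mean over classes. It assumes boxes in (x1, y1, x2, y2) format and greedy matching in score order; the function names are illustrative and not taken from any benchmark toolkit. The official COCO evaluation differs in its details (for example, it averages over several IoU thresholds), so treat this as an illustration rather than a reference implementation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truths, iou_threshold=0.5):
    """AP for one class. detections: list of (score, box); ground_truths: list of boxes."""
    if not ground_truths:
        return 0.0
    matched = set()
    tp_flags = []
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_tp = best_idx is not None and best_iou >= iou_threshold
        if is_tp:
            matched.add(best_idx)
        tp_flags.append(is_tp)

    # Precision and recall after each detection.
    precisions, recalls, tp = [], [], 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        tp += int(is_tp)
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truths))

    # Make precision non-increasing in recall, then integrate the curve.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class):
    """mAP as the plain mean of per-class AP values."""
    return sum(ap_per_class) / len(ap_per_class) if ap_per_class else 0.0
```

A false positive ranked above a true positive lowers AP, which is why the metric rewards well-calibrated confidence scores as well as accurate boxes.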

(Image credit: Detectron)

Most implemented papers

MMDetection: Open MMLab Detection Toolbox and Benchmark

open-mmlab/mmdetection 17 Jun 2019

In this paper, we introduce the various features of this toolbox.

You Only Look Once: Unified, Real-Time Object Detection

AlexeyAB/darknet CVPR 2016

A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation.

CSPNet: A New Backbone that can Enhance Learning Capability of CNN

AlexeyAB/darknet 27 Nov 2019

Neural networks have enabled state-of-the-art approaches to achieve incredible results on computer vision tasks such as object detection.

FCOS: Fully Convolutional One-Stage Object Detection

tianzhi0549/FCOS ICCV 2019

By eliminating the predefined set of anchor boxes, FCOS completely avoids the complicated computation related to anchor boxes such as calculating overlapping during training.
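
As a rough illustration of anchor-free regression, the sketch below computes, for a pixel location that falls inside a ground-truth box, the distances from that location to the four sides of the box. This is the kind of per-location target the FCOS paper describes in place of anchor matching. The function name is hypothetical and the code is not taken from the tianzhi0549/FCOS repository.

```python
def ltrb_targets(point, box):
    """Distances (left, top, right, bottom) from a point to the sides of a box.

    point: (x, y); box: (x1, y1, x2, y2).
    Returns None when the point lies outside the box, i.e. it is not a
    positive location for this box in this simplified sketch.
    """
    x, y = point
    x1, y1, x2, y2 = box
    left, top = x - x1, y - y1
    right, bottom = x2 - x, y2 - y
    if min(left, top, right, bottom) < 0:
        return None
    return left, top, right, bottom
```

No overlap between candidate boxes and ground truth is needed to build these targets, since each location only has to be tested for containment.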

Feature Pyramid Networks for Object Detection

PaddlePaddle/PaddleOCR CVPR 2017

Feature pyramids are a basic component in recognition systems for detecting objects at different scales.
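
One way a pyramid handles different scales is by routing each region of interest to a level chosen from its size. The sketch below follows the assignment rule given in the FPN paper, k = floor(k0 + log2(sqrt(w * h) / 224)). The defaults (k0 = 4, canonical size 224, levels 2 to 5) reflect that paper's setup as an assumption here, and the function name is illustrative.

```python
import math


def fpn_level(width, height, k0=4, canonical_size=224, min_level=2, max_level=5):
    """Pick a pyramid level for a region of the given width and height (pixels).

    Larger regions map to coarser (higher) levels, smaller regions to finer
    (lower) levels; the result is clamped to the available level range.
    """
    scale = math.sqrt(width * height)
    level = math.floor(k0 + math.log2(scale / canonical_size))
    return max(min_level, min(max_level, level))
```

For instance, a 224 x 224 region maps to level 4 under these defaults, while a region half that size on each side maps to level 3.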

Going Deeper with Convolutions

worksheets/0xbcd424d2 CVPR 2015

We propose a deep convolutional neural network architecture codenamed "Inception", which was responsible for setting the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014).

Objects as Points

xingyizhou/CenterNet 16 Apr 2019

We model an object as a single point: the center point of its bounding box.
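
A minimal sketch of the center-point idea: a box is encoded as a cell on a downsampled grid, a sub-cell offset, and a size, and can be decoded back from those three values. The output stride of 4 and the function names are assumptions made for illustration. This is not the xingyizhou/CenterNet code, which additionally predicts a heatmap over center locations.

```python
def encode_center(box, stride=4):
    """Encode (x1, y1, x2, y2) as (grid cell, sub-cell offset, box size)."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    # Integer cell on the downsampled output grid that contains the center.
    gx, gy = int(cx // stride), int(cy // stride)
    # Fractional part lost by the downsampling, kept so decoding is exact.
    offset = (cx / stride - gx, cy / stride - gy)
    size = (x2 - x1, y2 - y1)
    return (gx, gy), offset, size


def decode_center(cell, offset, size, stride=4):
    """Invert encode_center and return the box as (x1, y1, x2, y2)."""
    cx = (cell[0] + offset[0]) * stride
    cy = (cell[1] + offset[1]) * stride
    w, h = size
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
```

Because every object reduces to one point plus a few regressed values, no set of candidate boxes has to be enumerated per location.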

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

microsoft/Swin-Transformer ICCV 2021

This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision.

Frustum PointNets for 3D Object Detection from RGB-D Data

charlesq34/frustum-pointnets CVPR 2018

In this work, we study 3D object detection from RGB-D data in both indoor and outdoor scenes.

EfficientDet: Scalable and Efficient Object Detection

google/automl CVPR 2020

Model efficiency has become increasingly important in computer vision.