Object Detection

3645 papers with code • 84 benchmarks • 251 datasets

Object Detection is a computer vision task in which the goal is to detect and locate objects of interest in an image or video. The task involves identifying the position and boundaries of objects in an image, and classifying the objects into different categories. It forms a crucial part of vision recognition, alongside image classification and retrieval.

State-of-the-art methods fall into two main types: one-stage methods and two-stage methods:

  • One-stage methods prioritize inference speed, and example models include YOLO, SSD and RetinaNet.

  • Two-stage methods prioritize detection accuracy, and example models include Faster R-CNN, Mask R-CNN and Cascade R-CNN.

The most popular benchmark is the MS COCO dataset. Models are typically evaluated with the mean Average Precision (mAP) metric.
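To make the metric concrete, here is a minimal Python sketch of how Average Precision can be computed for a single class. It assumes boxes are given as (x1, y1, x2, y2) corner coordinates and that a detection counts as correct when its Intersection over Union (IoU) with a not-yet-matched ground-truth box reaches a threshold. The function names `iou` and `average_precision` are illustrative, not taken from any benchmark toolkit, and the official COCO evaluation differs in detail (for example, it averages over several IoU thresholds and over all categories).

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: Sequence[Tuple[float, Box]],  # (confidence score, box)
    ground_truth: Sequence[Box],
    iou_threshold: float = 0.5,
) -> float:
    """Area under the precision-recall curve for one class."""
    if not ground_truth:
        return 0.0

    matched = [False] * len(ground_truth)
    tp_flags: List[bool] = []

    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if matched[idx]:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_tp = best_idx >= 0 and best_iou >= iou_threshold
        if is_tp:
            matched[best_idx] = True
        tp_flags.append(is_tp)

    # Precision and recall after each detection in the ranked list.
    precisions: List[float] = []
    recalls: List[float] = []
    true_positives = 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        true_positives += int(is_tp)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))

    # Make precision non-increasing from right to left (interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangles under the interpolated curve.
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


if __name__ == "__main__":
    gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
    dets = [
        (0.9, (0, 0, 10, 10)),
        (0.8, (50, 50, 60, 60)),
        (0.7, (20, 20, 30, 30)),
    ]
    print(average_precision(dets, gt_boxes))
```

mAP is then the mean of these per-class AP values. The sketch handles a single image and a single class for brevity; a full evaluation pools detections across the whole dataset before ranking them.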

(Image credit: Detectron)


Benchmarking Object Detectors with COCO: A New Path Forward

kdexd/coco-rem 27 Mar 2024

With these findings, we advocate using COCO-ReM for future object detection research.

3

Ship in Sight: Diffusion Models for Ship-Image Super Resolution

luigisigillo/shipinsight 27 Mar 2024

In this context, our method explores in depth the problem of ship image super resolution, which is crucial for coastal and port surveillance.

1

PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition

chenhongyiyang/plainmamba 26 Mar 2024

In this paper, we further adapt the selective scanning process of Mamba to the visual domain, enhancing its ability to learn features from two-dimensional images by (i) a continuous 2D scanning process that improves spatial continuity by ensuring adjacency of tokens in the scanning sequence, and (ii) direction-aware updating which enables the model to discern the spatial relations of tokens by encoding directional information.

14

UADA3D: Unsupervised Adversarial Domain Adaptation for 3D Object Detection with Sparse LiDAR and Large Domain Gaps

maxiuw/uada3d 26 Mar 2024

In this study, we address a gap in existing unsupervised domain adaptation approaches on LiDAR-based 3D object detection, which have predominantly concentrated on adapting between established, high-density autonomous driving datasets.

0

Optimizing LiDAR Placements for Robust Driving Perception in Adverse Conditions

ywyeli/place3d 25 Mar 2024

The robustness of driving perception systems under unprecedented conditions is crucial for safety-critical usages.

11

Multiple Object Tracking as ID Prediction

MCG-NJU/MOTIP 25 Mar 2024

In Multiple Object Tracking (MOT), tracking-by-detection methods have stood the test for a long time, which split the process into two parts according to the definition: object detection and association.

10

RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection

vdigpku/rcbevdet 25 Mar 2024

In the dual-stream radar backbone, a point-based encoder and a transformer-based encoder are proposed to extract radar features, with an injection and extraction module to facilitate communication between the two encoders.

7

FOOL: Addressing the Downlink Bottleneck in Satellite Computing with Neural Feature Compression

rezafuru/the-fool 25 Mar 2024

Further, it embeds context and leverages inter-tile dependencies to lower transfer costs with negligible overhead.

1

SFOD: Spiking Fusion Object Detector

yimeng-fan/SFOD 22 Mar 2024

Thereby, we establish state-of-the-art classification results based on SNNs, achieving 93.7% accuracy on the NCAR dataset.

8

IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection

yinjunbo/is-fusion 22 Mar 2024

HSF applies Point-to-Grid and Grid-to-Region transformers to capture the multimodal scene context at different granularities.

7