Object Localization

234 papers with code • 18 benchmarks • 17 datasets

Object Localization is the task of locating an instance of a particular object category in an image, typically by specifying a tightly cropped bounding box centered on the instance. An object proposal specifies a candidate bounding box, and an object proposal is said to be a correct localization if it sufficiently overlaps a human-labeled “ground-truth” bounding box for the given object. In the literature, the “Object Localization” task is to locate one instance of an object category, whereas “object detection” focuses on locating all instances of a category in a given image.

Source: Fast On-Line Kernel Density Estimation for Active Object Localization
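
The overlap test in the definition above can be sketched in a few lines. The definition only says a proposal must "sufficiently overlap" a ground-truth box, so both the overlap measure (intersection over union, IoU) and the 0.5 threshold used here are assumptions, borrowed from the common PASCAL VOC convention. The names `iou` and `is_correct_localization` are hypothetical helpers for illustration.

```python
from typing import Sequence, Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct_localization(
    proposal: Box,
    ground_truths: Sequence[Box],
    threshold: float = 0.5,  # assumed threshold, not stated in the definition
) -> bool:
    """A proposal counts as correct if it overlaps any ground-truth box enough."""
    return any(iou(proposal, gt) >= threshold for gt in ground_truths)


if __name__ == "__main__":
    gt_boxes = [(12.0, 12.0, 52.0, 52.0)]
    print(is_correct_localization((10.0, 10.0, 50.0, 50.0), gt_boxes))
    print(is_correct_localization((100.0, 100.0, 120.0, 120.0), gt_boxes))
```

Under this sketch, localization needs only one proposal to pass the test for the category, whereas a detection metric would additionally check that every ground-truth instance is matched.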

Benchmarks

These leaderboards track progress in Object Localization.

Dataset	Best Model
IllusionVQA	GPT4-Vision 4-shot+CoT
KITTI Pedestrians Moderate	Frustum-PointPillars
KITTI Pedestrians Hard	Frustum-PointPillars
GRIT	Unified-IOXL
KITTI Cars Easy	VoxelNet
KITTI Cars Moderate	Frustum PointNets
KITTI Cars Hard	VoxelNet
KITTI Pedestrians Easy	Frustum PointNets
KITTI Cyclists Easy	Frustum PointNets
KITTI Cyclists Moderate	Frustum PointNets
KITTI Cyclists Hard	Frustum PointNets
Mall	Hausdorff Loss
Pupil	Hausdorff Loss
Plant	Hausdorff Loss
PASCAL VOC 2007	DeepCut
PASCAL VOC 2012	DeepCut
KITTI Pedestrian Easy	Frustum-PointPillars
REVERIE	CoLabBUAA_MiNLP

Libraries

Use these libraries to find Object Localization models and implementations

PaddlePaddle/PaddleDetection • 3 papers • 12,094 stars
jacobgil/pytorch-grad-cam • 3 papers • 9,476 stars
Westlake-AI/openmixup • 2 papers • 574 stars
jediofgever/PointNet_Custom_Object_… • 2 papers

See all 6 libraries.

Datasets

Subtasks

Monocular 3D Object Localization

Active Object Localization

Latest papers

Grounding Everything: Emerging Localization Properties in Vision-Language Transformers

walbouss/gem • 1 Dec 2023

To leverage those capabilities, we propose a Grounding Everything Module (GEM) that generalizes the idea of value-value attention introduced by CLIPSurgery to a self-self attention path.

Point, Segment and Count: A Generalized Framework for Object Counting

hzzone/pseco • 21 Nov 2023

In this paper, we propose a generalized framework for both few-shot and zero-shot object counting based on detection.

Towards Learning Monocular 3D Object Localization From 2D Labels using the Physical Laws of Motion

KieDani/Towards_3D_Object_Localization • 26 Oct 2023

We present a novel method for precise 3D object localization in single images from a single calibrated camera using only 2D labels.

Unsupervised Object Localization in the Era of Self-Supervised ViTs: A Survey

valeoai/awesome-unsupervised-object-localization • 19 Oct 2023

We propose here a survey of unsupervised object localization methods that discover objects in images without requiring any manual annotation in the era of self-supervised ViTs.

DiPS: Discriminative Pseudo-Label Sampling with Self-Supervised Transformers for Weakly Supervised Object Localization

shakeebmurtaza/dips • 9 Oct 2023

Subsequently, these proposals are used as pseudo-labels to train our new transformer-based WSOL model designed to perform classification and localization tasks.

CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection

yangcaoai/CoDA_NeurIPS2023 • NeurIPS 2023

Open-vocabulary 3D Object Detection (OV-3DDet) aims to detect objects from an arbitrary list of categories within a 3D scene, which remains seldom explored in the literature.

139

04 Oct 2023

Learning to Terminate in Object Navigation

huskykingdom/dita_acml2023 • 28 Sep 2023

This paper tackles the critical challenge of object navigation in autonomous navigation systems, particularly focusing on the problem of target approach and episode termination in environments with long optimal episode length in Deep Reinforcement Learning (DRL) based methods.

Context-Aware Entity Grounding with Open-Vocabulary 3D Scene Graphs

changhaonan/ovsg • 27 Sep 2023

We present an Open-Vocabulary 3D Scene Graph (OVSG), a formal framework for grounding a variety of entities, such as object instances, agents, and regions, with free-form text-based queries.

CLIP-DIY: CLIP Dense Inference Yields Open-Vocabulary Semantic Segmentation For-Free

wysoczanska/clip-diy • 25 Sep 2023

The emergence of CLIP has opened the way for open-world image perception.

Background Activation Suppression for Weakly Supervised Object Localization and Semantic Segmentation

wpy1999/bas • 22 Sep 2023

In addition, our method also achieves state-of-the-art weakly supervised semantic segmentation performance on the PASCAL VOC 2012 and MS COCO 2014 datasets.
