Object Localization

234 papers with code • 18 benchmarks • 17 datasets

Object Localization is the task of locating an instance of a particular object category in an image, typically by specifying a tightly cropped bounding box centered on the instance. An object proposal specifies a candidate bounding box, and a proposal is said to be a correct localization if it sufficiently overlaps a human-labeled “ground-truth” bounding box for the given object. In the literature, “object localization” refers to locating one instance of an object category, whereas “object detection” refers to locating all instances of a category in a given image.

Source: Fast On-Line Kernel Density Estimation for Active Object Localization
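
The overlap criterion in the description above is commonly measured with intersection over union (IoU). The following is a minimal sketch of that check, not the method of any paper listed here. The box layout (x_min, y_min, x_max, y_max), the function names, and the 0.5 threshold are assumptions chosen for illustration; the description only says the overlap must be "sufficient".

```python
from typing import Sequence, Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct_localization(
    proposal: Box,
    ground_truths: Sequence[Box],
    threshold: float = 0.5,  # assumed value; the right threshold is benchmark-specific
) -> bool:
    """True if the proposal overlaps at least one ground-truth box enough.

    Localization only needs one instance to be found, so a match with any
    single ground-truth box counts as correct.
    """
    return any(iou(proposal, gt) >= threshold for gt in ground_truths)
```

For example, `is_correct_localization((0, 0, 10, 10), [(1, 1, 10, 10)])` compares one proposal against one labeled box. A detection-style evaluation would instead require every ground-truth box to be matched by some proposal.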

Benchmarks

These leaderboards are used to track progress in Object Localization.

Dataset	Best Model
IllusionVQA	GPT4-Vision 4-shot+CoT
KITTI Pedestrians Moderate	Frustrum-PointPillars
KITTI Pedestrians Hard	Frustrum-PointPillars
GRIT	Unified-IOXL
KITTI Cars Easy	VoxelNet
KITTI Cars Moderate	Frustum PointNets
KITTI Cars Hard	VoxelNet
KITTI Pedestrians Easy	Frustum PointNets
KITTI Cyclists Easy	Frustum PointNets
KITTI Cyclists Moderate	Frustum PointNets
KITTI Cyclists Hard	Frustum PointNets
Mall	Hausdorff Loss
Pupil	Hausdorff Loss
Plant	Hausdorff Loss
PASCAL VOC 2007	DeepCut
PASCAL VOC 2012	DeepCut
KITTI Pedestrian Easy	Frustrum-PointPillars
REVERIE	CoLabBUAA_MiNLP

Libraries

Use these libraries to find Object Localization models and implementations.

PaddlePaddle/PaddleDetection • 3 papers • 12,095

jacobgil/pytorch-grad-cam • 3 papers • 9,481

Westlake-AI/openmixup • 2 papers • 574

jediofgever/PointNet_Custom_Object_… • 2 papers

See all 6 libraries.

Datasets

Subtasks

Monocular 3D Object Localization

Active Object Localization

Latest papers

Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data

Earth-Intelligence-Lab/vleo-bench • 31 Jan 2024

Large Vision-Language Models (VLMs) have demonstrated impressive performance on complex tasks involving visual input with natural language instructions.

CPR++: Object Localization via Single Coarse Point Supervision

ucas-vg/TinyBenchmark • 30 Jan 2024

CPR reduces the semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point.

636

Spatial Structure Constraints for Weakly Supervised Semantic Segmentation

nust-machine-intelligence-laboratory/ssc • 20 Jan 2024

In this paper, we propose spatial structure constraints (SSC) for weakly supervised semantic segmentation to alleviate the unwanted object over-activation of attention expansion.

Bilateral Reference for High-Resolution Dichotomous Image Segmentation

zhengpeng7/birefnet • 7 Jan 2024

BiRefNet comprises two essential components: the localization module (LM) and the reconstruction module (RM) with our proposed bilateral reference (BiRef).

168

LangSplat: 3D Language Gaussian Splatting

minghanqin/LangSplat • 26 Dec 2023

Humans live in a 3D world and commonly use natural language to interact with a 3D scene.

403

Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of Multiclass Defect Segmentation

rashaalshawi/dual-attention-u-net-with-feature-infusion-pushing-the-boundaries-of-multiclass-defect-segmentation • 21 Dec 2023

The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI Net), addresses challenges in semantic segmentation, particularly on multiclass imbalanced datasets with limited samples.

Object-Aware Domain Generalization for Object Detection

WoojuLee24/OA-DG • 19 Dec 2023

To address these problems, we propose an object-aware domain generalization (OA-DG) method for single-domain generalization in object detection.

Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance

VinAIResearch/Open3DIS • 17 Dec 2023

We introduce Open3DIS, a novel solution designed to tackle the problem of Open-Vocabulary Instance Segmentation within 3D scenes.

Mono3DVG: 3D Visual Grounding in Monocular Images

zhanyang-nwpu/mono3dvg • 13 Dec 2023

To foster this task, we propose Mono3DVG-TR, an end-to-end transformer-based network, which takes advantage of both the appearance and geometry information in text embeddings for multi-modal learning and 3D object localization.

Boosting Segment Anything Model Towards Open-Vocabulary Learning

ucas-vg/sambor • 6 Dec 2023

The recent Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model, showcasing potent zero-shot generalization and flexible prompting.
