Object Localization

234 papers with code • 18 benchmarks • 17 datasets

Object Localization is the task of locating an instance of a particular object category in an image, typically by specifying a tightly cropped bounding box centered on the instance. An object proposal specifies a candidate bounding box, and a proposal counts as a correct localization if it sufficiently overlaps a human-labeled “ground-truth” bounding box for the given object. In the literature, “object localization” refers to locating one instance of an object category, whereas “object detection” refers to locating all instances of a category in a given image.

Source: Fast On-Line Kernel Density Estimation for Active Object Localization
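
The overlap test described above is commonly measured with intersection over union (IoU). The following is a minimal sketch of that check, not taken from the cited paper: the box layout (x_min, y_min, x_max, y_max), the function names, and the 0.5 threshold are all assumptions chosen for illustration, since the definition above only says "sufficiently overlaps".

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return the intersection-over-union of two axis-aligned boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so that disjoint boxes give zero intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) boxes.
    return inter / union if union > 0 else 0.0


def is_correct_localization(proposal: Box, ground_truth: Box,
                            threshold: float = 0.5) -> bool:
    """Treat a proposal as correct if its IoU with the ground truth
    reaches the threshold. The default of 0.5 is an assumed convention,
    not a value stated in the text above."""
    return iou(proposal, ground_truth) >= threshold


if __name__ == "__main__":
    ground_truth = (0.0, 0.0, 10.0, 10.0)
    proposal = (1.0, 1.0, 10.0, 10.0)
    print(iou(proposal, ground_truth))
    print(is_correct_localization(proposal, ground_truth))
```

Benchmarks differ in the threshold they use, so in practice the value would be set to match the evaluation protocol of the dataset at hand.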

Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data

Earth-Intelligence-Lab/vleo-bench 31 Jan 2024

Large Vision-Language Models (VLMs) have demonstrated impressive performance on complex tasks involving visual input with natural language instructions.

5

CPR++: Object Localization via Single Coarse Point Supervision

ucas-vg/TinyBenchmark 30 Jan 2024

CPR reduces the semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point.

636

Spatial Structure Constraints for Weakly Supervised Semantic Segmentation

nust-machine-intelligence-laboratory/ssc 20 Jan 2024

In this paper, we propose spatial structure constraints (SSC) for weakly supervised semantic segmentation to alleviate the unwanted object over-activation of attention expansion.

5

Bilateral Reference for High-Resolution Dichotomous Image Segmentation

zhengpeng7/birefnet 7 Jan 2024

The proposed method comprises two essential components: a localization module (LM) and a reconstruction module (RM) with the proposed bilateral reference (BiRef).

168

LangSplat: 3D Language Gaussian Splatting

minghanqin/LangSplat 26 Dec 2023

Humans live in a 3D world and commonly use natural language to interact with a 3D scene.

403

Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of Multiclass Defect Segmentation

rashaalshawi/dual-attention-u-net-with-feature-infusion-pushing-the-boundaries-of-multiclass-defect-segmentation 21 Dec 2023

The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI Net), addresses challenges in semantic segmentation, particularly on multiclass imbalanced datasets with limited samples.

9

Object-Aware Domain Generalization for Object Detection

WoojuLee24/OA-DG 19 Dec 2023

To address these problems, we propose an object-aware domain generalization (OA-DG) method for single-domain generalization in object detection.

38

Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance

VinAIResearch/Open3DIS 17 Dec 2023

We introduce Open3DIS, a novel solution designed to tackle the problem of Open-Vocabulary Instance Segmentation within 3D scenes.

23

Mono3DVG: 3D Visual Grounding in Monocular Images

zhanyang-nwpu/mono3dvg 13 Dec 2023

To foster this task, we propose Mono3DVG-TR, an end-to-end transformer-based network, which takes advantage of both the appearance and geometry information in text embeddings for multi-modal learning and 3D object localization.

12

Boosting Segment Anything Model Towards Open-Vocabulary Learning

ucas-vg/sambor 6 Dec 2023

The recent Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model, showcasing potent zero-shot generalization and flexible prompting.

29