Object Localization

231 papers with code • 18 benchmarks • 17 datasets

Object Localization is the task of locating an instance of a particular object category in an image, typically by specifying a tightly cropped bounding box centered on the instance. An object proposal specifies a candidate bounding box, and an object proposal is said to be a correct localization if it sufficiently overlaps a human-labeled “ground-truth” bounding box for the given object. In the literature, the “Object Localization” task is to locate one instance of an object category, whereas “object detection” focuses on locating all instances of a category in a given image.

Source: Fast On-Line Kernel Density Estimation for Active Object Localization
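
The overlap criterion in the definition above is commonly measured as intersection over union (IoU) between the proposed box and the ground-truth box. The following is a minimal sketch of that check, not the method of any paper listed here. The box format `(x_min, y_min, x_max, y_max)`, the function names, and the 0.5 threshold are assumptions chosen for illustration; benchmarks set their own thresholds.

```python
from typing import Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so non-overlapping boxes give zero intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_correct_localization(
    proposal: Box, ground_truth: Box, threshold: float = 0.5
) -> bool:
    """True if the proposal overlaps the ground truth by at least `threshold`.

    The default of 0.5 is an illustrative assumption, not a universal rule.
    """
    return iou(proposal, ground_truth) >= threshold
```

For example, under these assumptions a proposal `(0, 0, 10, 8)` against a ground-truth box `(0, 0, 10, 10)` has an IoU of 0.8 and would count as a correct localization, while a proposal shifted halfway off the object would not.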


Bilateral Reference for High-Resolution Dichotomous Image Segmentation

zhengpeng7/birefnet 7 Jan 2024

BiRefNet comprises two essential components: the localization module (LM) and the reconstruction module (RM), which uses the proposed bilateral reference (BiRef).

153 stars

LangSplat: 3D Language Gaussian Splatting

minghanqin/LangSplat 26 Dec 2023

Humans live in a 3D world and commonly use natural language to interact with a 3D scene.

394 stars

Dual Attention U-Net with Feature Infusion: Pushing the Boundaries of Multiclass Defect Segmentation

rashaalshawi/dual-attention-u-net-with-feature-infusion-pushing-the-boundaries-of-multiclass-defect-segmentation 21 Dec 2023

The proposed architecture, Dual Attentive U-Net with Feature Infusion (DAU-FI Net), addresses challenges in semantic segmentation, particularly on multiclass imbalanced datasets with limited samples.

9 stars

Object-Aware Domain Generalization for Object Detection

WoojuLee24/OA-DG 19 Dec 2023

We propose an object-aware domain generalization (OA-DG) method for single-domain generalization in object detection.

36 stars

Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance

VinAIResearch/Open3DIS 17 Dec 2023

We introduce Open3DIS, a novel solution designed to tackle the problem of Open-Vocabulary Instance Segmentation within 3D scenes.

19 stars

Mono3DVG: 3D Visual Grounding in Monocular Images

zhanyang-nwpu/mono3dvg 13 Dec 2023

We propose Mono3DVG-TR, an end-to-end transformer-based network that uses both the appearance and geometry information in text embeddings for multi-modal learning and 3D object localization.

10 stars

Boosting Segment Anything Model Towards Open-Vocabulary Learning

ucas-vg/sambor 6 Dec 2023

The recent Segment Anything Model (SAM) has emerged as a new paradigmatic vision foundation model, showcasing potent zero-shot generalization and flexible prompting.

29 stars

Grounding Everything: Emerging Localization Properties in Vision-Language Transformers

walbouss/gem 1 Dec 2023

We propose a Grounding Everything Module (GEM) that generalizes the idea of value-value attention introduced by CLIPSurgery to a self-self attention path.

52 stars

Point, Segment and Count: A Generalized Framework for Object Counting

hzzone/pseco 21 Nov 2023

In this paper, we propose a generalized framework for both few-shot and zero-shot object counting based on detection.

53 stars

Towards Learning Monocular 3D Object Localization From 2D Labels using the Physical Laws of Motion

KieDani/Towards_3D_Object_Localization 26 Oct 2023

We present a novel method for precise 3D object localization in single images from a single calibrated camera using only 2D labels.

5 stars