Object Localization

231 papers with code • 18 benchmarks • 17 datasets

Object Localization is the task of locating an instance of a particular object category in an image, typically by specifying a tightly cropped bounding box centered on the instance. An object proposal specifies a candidate bounding box, and an object proposal is said to be a correct localization if it sufficiently overlaps a human-labeled “ground-truth” bounding box for the given object. In the literature, the “Object Localization” task is to locate one instance of an object category, whereas “object detection” focuses on locating all instances of a category in a given image.

Source: Fast On-Line Kernel Density Estimation for Active Object Localization
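
The overlap criterion in the definition above is commonly measured as intersection over union (IoU) between a proposal and a ground-truth box. The following is a minimal sketch, not taken from the cited paper. The `(x_min, y_min, x_max, y_max)` box format, the helper names `iou` and `is_correct_localization`, and the 0.5 threshold are all assumptions chosen for illustration; benchmarks may use different conventions.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])

    # Clamp to zero so disjoint boxes yield zero intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct_localization(
    proposal: Box,
    ground_truths: Sequence[Box],
    threshold: float = 0.5,  # assumed threshold, not specified in the text above
) -> bool:
    """True if the proposal overlaps any ground-truth box by at least `threshold` IoU."""
    return any(iou(proposal, gt) >= threshold for gt in ground_truths)


if __name__ == "__main__":
    proposal = (0.0, 0.0, 10.0, 10.0)
    labels = [(1.0, 1.0, 10.0, 10.0), (50.0, 50.0, 60.0, 60.0)]
    print(is_correct_localization(proposal, labels))
```

Under this sketch, localization needs only one proposal to match one labeled instance, whereas a detection evaluation would also check that every labeled instance is matched by some proposal.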


Realistic Model Selection for Weakly Supervised Object Localization

shakeebmurtaza/wsol_model_selection 15 Apr 2024

Our experimental results with several WSOL methods on ILSVRC and CUB-200-2011 datasets show that our noisy boxes allow selecting models with performance close to those selected using ground truth boxes, and better than models selected using only image-class labels.

1

FlightScope: A Deep Comprehensive Assessment of Aircraft Detection Algorithms in Satellite Imagery

toelt-llc/FlightScope_Bench 3 Apr 2024

Object detection in remotely sensed satellite imagery is fundamental in many fields, such as biophysical and environmental monitoring.

13

IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models

csebuetnlp/illusionvqa 23 Mar 2024

GPT4V, the best-performing VLM, achieves 62.99% accuracy (4-shot) on the comprehension task and 49.7% on the localization task (4-shot and Chain-of-Thought).

3

Few-shot Object Localization

ryh1218/fsol 19 Mar 2024

This task achieves generalized object localization by leveraging a small number of labeled support samples to query the positional information of objects within corresponding images.

4

CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective

snskysk/cam-back-again 11 Mar 2024

The high performance of large kernel CNNs in downstream tasks has been attributed to the large effective receptive field (ERF) produced by large kernels, but this view has not been fully tested.

3

The All-Seeing Project V2: Towards General Relation Comprehension of the Open World

opengvlab/all-seeing 29 Feb 2024

In addition, we design a new benchmark, termed Circular-based Relation Probing Evaluation (CRPE), for comprehensively evaluating the relation comprehension capabilities of MLLMs.

369

Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models

tri-ml/prismatic-vlms 12 Feb 2024

Visually-conditioned language models (VLMs) have seen growing adoption in applications such as visual dialogue, scene understanding, and robotic task planning; adoption that has fueled a wealth of new models such as LLaVa, InstructBLIP, and PaLI-3.

201

Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data

Earth-Intelligence-Lab/vleo-bench 31 Jan 2024

Large Vision-Language Models (VLMs) have demonstrated impressive performance on complex tasks involving visual input with natural language instructions.

5

CPR++: Object Localization via Single Coarse Point Supervision

ucas-vg/pointtinybenchmark 30 Jan 2024

CPR reduces the semantic variance by selecting a semantic centre point in a neighbourhood region to replace the initial annotated point.

632

Spatial Structure Constraints for Weakly Supervised Semantic Segmentation

nust-machine-intelligence-laboratory/ssc 20 Jan 2024

In this paper, we propose spatial structure constraints (SSC) for weakly supervised semantic segmentation to alleviate the unwanted object over-activation of attention expansion.

5