Object Localization
231 papers with code • 18 benchmarks • 17 datasets
Object Localization is the task of locating an instance of a particular object category in an image, typically by specifying a tightly cropped bounding box centered on the instance. An object proposal specifies a candidate bounding box, and an object proposal is said to be a correct localization if it sufficiently overlaps a human-labeled “ground-truth” bounding box for the given object. In the literature, the “Object Localization” task is to locate one instance of an object category, whereas “object detection” focuses on locating all instances of a category in a given image.
Source: Fast On-Line Kernel Density Estimation for Active Object Localization
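The overlap criterion in the definition above can be sketched in code. This is a minimal illustration rather than the source paper's method: it assumes overlap is measured by intersection over union (IoU), that boxes are given as (x_min, y_min, x_max, y_max), and that "sufficient" means IoU of at least 0.5, a common convention. The names `iou` and `is_correct_localization` are hypothetical helpers introduced here.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlapping rectangle, clamped at zero
    # so that disjoint boxes yield an empty intersection.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_correct_localization(
    proposal: Box,
    ground_truth_boxes: Sequence[Box],
    threshold: float = 0.5,  # assumed threshold; the definition only says "sufficiently"
) -> bool:
    """True if the proposal overlaps any ground-truth box of the category."""
    return any(iou(proposal, gt) >= threshold for gt in ground_truth_boxes)
```

For example, a proposal (0, 0, 10, 10) compared against a ground-truth box (1, 1, 10, 10) has an IoU of 0.81 and would count as a correct localization under the assumed threshold. Localization in the sense above needs only one such match per image, whereas detection would be scored against every ground-truth instance.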
Latest papers with no code
Weakly Supervised Monocular 3D Detection with a Single-View Image
We propose SKD-WM3D, a weakly supervised monocular 3D detection (M3D) framework that exploits depth information to achieve M3D from a single-view image alone, without any 3D annotations or other training data.
Toward Accurate Camera-based 3D Object Detection via Cascade Depth Estimation and Calibration
First, a depth estimation (DE) scheme leverages relative depth information to lift features effectively from 2D to 3D space.
MsSVT++: Mixed-scale Sparse Voxel Transformer with Center Voting for 3D Object Detection
To mitigate the computational complexity associated with applying a window-based transformer in 3D voxel space, we introduce a novel Chessboard Sampling strategy and implement voxel sampling and gathering operations sparsely using a hash map.
Removal and Selection: Improving RGB-Infrared Object Detection via Coarse-to-Fine Fusion
Specifically, following this perspective, we design a Redundant Spectrum Removal module to coarsely remove interfering information within each modality and a Dynamic Feature Selection module to finely select the desired features for feature fusion.
Domain Adaptation for Large-Vocabulary Object Detectors
Large-vocabulary object detectors (LVDs) aim to detect objects of many categories; they learn super objectness features and can locate objects accurately when applied to various downstream data.
GTA: Guided Transfer of Spatial Attention from Object-Centric Representations
Through experimental analysis using attention maps in ViT, we observe that the rich representations deteriorate when trained on a small dataset.
FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection
Compared to state-of-the-art methods, our proposed method delivers comparable performance on DOTA-v1.0 and outperforms by 1.5 mAP on DOTA-v1.5, all while significantly reducing the model parameters to 16%.
Weakly Supervised Open-Vocabulary Object Detection
Despite weakly supervised object detection (WSOD) being a promising step toward evading strong instance-level annotations, its capability is confined to closed-set categories within a single training dataset.
Multiscale Vision Transformer With Deep Clustering-Guided Refinement for Weakly Supervised Object Localization
This work addresses the task of weakly-supervised object localization.
BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection
Recently, the rise of query-based Transformer decoders is reshaping camera-based 3D object detection.