Image Retrieval
666 papers with code • 54 benchmarks • 75 datasets
Image Retrieval is a fundamental, long-standing computer vision task: given a query, find similar images in a large database. It is often treated as a form of fine-grained, instance-level classification. Alongside classification and detection, it is a core part of image recognition, and it also has substantial business value because it helps users discover images that match their interests or requirements, whether by visual similarity or by other criteria.
(Image credit: DELF)
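As a rough illustration of the task described above, retrieval is commonly framed as nearest-neighbor search over embedding vectors. The sketch below is a minimal example under stated assumptions, not the method of any paper listed here: the function names `l2_normalize` and `retrieve`, the image ids, and the toy 3-dimensional embeddings are all hypothetical, and a real system would obtain embeddings from a trained model and typically use an approximate nearest-neighbor index instead of a full scan.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def retrieve(query, database, top_k=3):
    """Rank database entries by cosine similarity to the query embedding.

    `database` maps a (hypothetical) image id to its embedding vector.
    Returns a list of (image_id, similarity) pairs, best match first.
    """
    q = l2_normalize(query)
    scored = []
    for image_id, emb in database.items():
        e = l2_normalize(emb)
        # Dot product of unit vectors equals cosine similarity.
        scored.append((image_id, sum(a * b for a, b in zip(q, e))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy embeddings standing in for the output of a real feature extractor.
database = {
    "cat_1.jpg": [0.9, 0.1, 0.0],
    "cat_2.jpg": [0.8, 0.2, 0.1],
    "car_1.jpg": [0.0, 0.1, 0.9],
}
query_embedding = [1.0, 0.0, 0.0]
results = retrieve(query_embedding, database, top_k=2)
```

Here `results` holds the two database images whose embeddings point most nearly in the same direction as the query. Cosine similarity is only one possible choice of similarity measure; other metrics or learned scoring functions fit the same interface.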
Latest papers with no code
Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval
Composed Image Retrieval (CIR) is a task that retrieves images similar to a query, based on a provided textual modification.
Collaborative Visual Place Recognition through Federated Learning
Visual Place Recognition (VPR) aims to estimate the location of an image by treating it as a retrieval problem.
Shotit: compute-efficient image-to-video search engine for the cloud
We present Shotit, a cloud-native image-to-video search engine that tailors this search scenario in a compute-efficient approach.
Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives
The Composed Image Retrieval (CIR) task aims to retrieve target images using a composed query consisting of a reference image and a modified text.
Spatial-Aware Image Retrieval: A Hyperdimensional Computing Approach for Efficient Similarity Hashing
Our work introduces a transformative image hashing framework enabling spatial-aware conditional retrieval.
Learning Embeddings with Centroid Triplet Loss for Object Identification in Robotic Grasping
A crucial practical aspect for an object identification model is to be flexible in input size.
Soft-Prompting with Graph-of-Thought for Multi-modal Representation Learning
It is a step-by-step linear reasoning process that adjusts the length of the chain to improve the performance of generated prompts.
On the Estimation of Image-matching Uncertainty in Visual Place Recognition
In Visual Place Recognition (VPR) the pose of a query image is estimated by comparing the image to a map of reference images with known reference poses.
Do Vision-Language Models Understand Compound Nouns?
We curate Compun, a novel benchmark with 400 unique and commonly used CNs, to evaluate the effectiveness of VLMs in interpreting CNs.
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
Image retrieval, i.e., finding desired images given a reference image, inherently encompasses rich, multi-faceted search intents that are difficult to capture solely using image-based measures.