Image Retrieval
667 papers with code • 54 benchmarks • 75 datasets
Image Retrieval is a fundamental and long-standing computer vision task: given a query image, find similar images in a large database. It is often framed as a form of fine-grained, instance-level classification. Beyond being integral to image recognition alongside classification and detection, it also holds substantial business value, helping users discover images that match their interests or requirements, guided by visual similarity or other criteria.
(Image credit: DELF)
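The core recipe behind most modern retrieval systems is to embed every image as a feature vector and rank the database by similarity to the query's vector. A minimal sketch (the `retrieve` helper and the toy feature vectors are illustrative, not from any paper on this page):

```python
import numpy as np

def retrieve(query: np.ndarray, database: np.ndarray, k: int = 3) -> np.ndarray:
    """Return indices of the k database vectors most similar to the query,
    ranked by cosine similarity."""
    # L2-normalise so that a dot product equals cosine similarity
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    scores = db @ q
    return np.argsort(-scores)[:k]

# Toy example: 5 database "images" represented as 4-d feature vectors
rng = np.random.default_rng(0)
db = rng.normal(size=(5, 4))
query = db[2] + 0.01 * rng.normal(size=4)  # near-duplicate of image 2
print(retrieve(query, db))                 # image 2 ranks first
```

In practice the vectors come from a learned network (global descriptors, local features, or both) and the exhaustive dot product is replaced by an approximate nearest-neighbour index at scale.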
Most implemented papers
12-in-1: Multi-Task Vision and Language Representation Learning
Much of vision-and-language research focuses on a small but diverse set of independent tasks and supporting datasets often studied in isolation; however, the visually-grounded language understanding skills required for success at these tasks overlap significantly.
Cross-Batch Memory for Embedding Learning
This suggests that the features of instances computed at preceding iterations can be used to considerably approximate their features extracted by the current model.
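This "slow drift" observation motivates keeping a FIFO memory of embeddings from recent iterations and reusing them as extra comparison pairs. A minimal sketch of such a memory (class name and API are illustrative, not the paper's code):

```python
import numpy as np
from collections import deque

class CrossBatchMemory:
    """FIFO memory of embeddings and labels from recent iterations.

    Assumes embeddings drift slowly during training, so stored features
    approximate what the current model would produce.
    """
    def __init__(self, size: int):
        self.feats = deque(maxlen=size)   # oldest entries evicted first
        self.labels = deque(maxlen=size)

    def enqueue(self, batch_feats, batch_labels):
        for f, l in zip(batch_feats, batch_labels):
            self.feats.append(f)
            self.labels.append(l)

    def get(self):
        return np.array(self.feats), np.array(self.labels)

mem = CrossBatchMemory(size=4)
mem.enqueue(np.ones((3, 2)), [0, 1, 2])
mem.enqueue(np.zeros((2, 2)), [3, 4])   # oldest entry (label 0) evicted
feats, labels = mem.get()
print(labels)  # [1 2 3 4]
```

The memory lets a metric-learning loss mine negatives across many past batches at negligible extra compute, rather than only within the current mini-batch.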
DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning
We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations to determine image relevance.
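With uniform weights over two equal-size sets of local features, the Earth Mover's Distance reduces to an optimal assignment problem, which makes a compact sketch possible (this simplified uniform-weight case is an assumption for illustration; the paper solves the general differentiable transport problem):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def emd_uniform(x: np.ndarray, y: np.ndarray) -> float:
    """EMD between two equal-size sets of local features with uniform
    weights: the optimal transport plan is an optimal assignment."""
    # cost[i, j] = Euclidean distance between feature i of x and j of y
    cost = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].mean())

a = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([[1.0, 0.0], [0.0, 0.0]])  # same set, permuted
print(emd_uniform(a, b))  # 0.0: EMD ignores the ordering of the features
```

Because the distance matches feature locations rather than comparing pooled vectors, it captures structural similarity between dense representations.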
DOLG: Single-Stage Image Retrieval with Deep Orthogonal Fusion of Local and Global Features
Components orthogonal to the global image representation are then extracted from the local information.
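The orthogonal-component idea can be sketched as removing from each local feature its projection onto the global descriptor (function name and the toy vectors are illustrative, not DOLG's actual implementation):

```python
import numpy as np

def orthogonal_component(local_feats: np.ndarray,
                         global_feat: np.ndarray) -> np.ndarray:
    """Subtract from each local feature its projection onto the global
    descriptor, keeping only the component orthogonal to it."""
    g = global_feat / np.linalg.norm(global_feat)
    proj = (local_feats @ g)[:, None] * g[None, :]  # projection onto g
    return local_feats - proj

locals_ = np.array([[2.0, 1.0], [0.0, 3.0]])
g = np.array([1.0, 0.0])
orth = orthogonal_component(locals_, g)
print(orth)  # component along g removed: [[0. 1.], [0. 3.]]
```

Fusing the global descriptor with only the orthogonal residue avoids encoding the same information twice in the final representation.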
CNN Features off-the-shelf: an Astounding Baseline for Recognition
We report on a series of experiments conducted for different recognition tasks using the publicly available code and model of the OverFeat network, which was trained to perform object classification on ILSVRC13.
Improving zero-shot learning by mitigating the hubness problem
The zero-shot paradigm exploits vector-based word representations extracted from text corpora with unsupervised methods to learn general mapping functions from other feature spaces onto word space, where the words associated to the nearest neighbours of the mapped vectors are used as their linguistic labels.
End-to-end Learning of Deep Visual Representations for Image Retrieval
Second, we build on the recent R-MAC descriptor, show that it can be interpreted as a deep and differentiable architecture, and present improvements to enhance it.
A Discriminatively Learned CNN Embedding for Person Re-identification
We revisit two popular convolutional neural network (CNN) models in person re-identification (re-ID), i.e., verification and classification models.
Working hard to know your neighbor's margins: Local descriptor learning loss
We introduce a novel loss for learning local feature descriptors which is inspired by Lowe's matching criterion for SIFT.
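A triplet margin loss with hardest in-batch negative mining, in the spirit of this matching criterion, can be sketched as follows (a simplified numpy version, not the paper's exact formulation):

```python
import numpy as np

def hard_margin_loss(anchors: np.ndarray, positives: np.ndarray,
                     margin: float = 1.0) -> float:
    """Margin loss where each descriptor is pushed away from its hardest
    (closest) non-matching descriptor in the batch."""
    # pairwise distances; matching pairs sit on the diagonal
    d = np.linalg.norm(anchors[:, None, :] - positives[None, :, :], axis=-1)
    pos = np.diag(d)
    d_neg = d + np.eye(len(d)) * 1e9         # mask out matching pairs
    hardest = np.minimum(d_neg.min(axis=1), d_neg.min(axis=0))
    return float(np.maximum(0.0, margin + pos - hardest).mean())

anchors = np.array([[0.0, 0.0], [10.0, 0.0]])
positives = np.array([[0.0, 1.0], [10.0, 1.0]])
print(hard_margin_loss(anchors, positives))  # 0.0: negatives are far away
```

Mining only the hardest negative per pair keeps the loss focused on the descriptors most likely to cause false matches, mirroring how Lowe's ratio test rejects ambiguous SIFT matches.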
Composing Text and Image for Image Retrieval - An Empirical Odyssey
In this paper, we study the task of image retrieval, where the input query is specified in the form of an image plus some text that describes desired modifications to the input image.