CNN Image Retrieval Learns from BoW: Unsupervised Fine-Tuning with Hard Examples

8 Apr 2016  ·  Filip Radenović, Giorgos Tolias, Ondřej Chum

Convolutional Neural Networks (CNNs) achieve state-of-the-art performance in many computer vision tasks. However, this achievement depends on extensive manual annotation, needed either to train from scratch or to fine-tune for the target task. In this work, we propose to fine-tune a CNN for image retrieval from a large collection of unordered images in a fully automated manner. We employ state-of-the-art retrieval and Structure-from-Motion (SfM) methods to obtain 3D models, which are used to guide the selection of the training data for CNN fine-tuning. We show that both hard positive and hard negative examples enhance the final performance in particular-object retrieval with compact codes.
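The abstract does not spell out how hard negatives are chosen, so the following is only a minimal sketch of one plausible reading: for a query image, take the most similar images (by cosine similarity of their current descriptors) that belong to a different 3D model. The function names, the `cluster_ids` list (one 3D-model label per image), and the rule of keeping at most one negative per model are assumptions made for illustration, not details confirmed by this page.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def hard_negatives(query_idx, descriptors, cluster_ids, k=2):
    """Pick up to k hard negatives for the query image.

    A negative is any image whose cluster (3D model) differs from the
    query's. "Hard" means high cosine similarity to the query despite
    not matching. Assumption of this sketch: at most one negative is
    kept per cluster, so the negatives stay diverse.
    """
    q = l2_normalize(descriptors[query_idx])
    scored = []
    for i, d in enumerate(descriptors):
        if cluster_ids[i] == cluster_ids[query_idx]:
            continue  # same 3D model: a potential positive, never a negative
        sim = sum(a * b for a, b in zip(q, l2_normalize(d)))
        scored.append((sim, i))
    scored.sort(key=lambda t: -t[0])  # most similar non-matching images first

    picked, seen_clusters = [], set()
    for _, i in scored:
        if cluster_ids[i] in seen_clusters:
            continue
        picked.append(i)
        seen_clusters.add(cluster_ids[i])
        if len(picked) == k:
            break
    return picked


# Toy data: five 2-D descriptors spread over three hypothetical 3D models.
toy_descriptors = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.6], [0.7, 0.7], [0.0, 1.0]]
toy_clusters = [0, 0, 1, 1, 2]
print(hard_negatives(0, toy_descriptors, toy_clusters, k=2))
```

In a fine-tuning loop the descriptors would be recomputed with the current network from time to time, so that the set of hard negatives tracks what the model still confuses; that refresh schedule is likewise an assumption here.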


Datasets


Introduced in the Paper:

Retrieval-SfM

Used in the Paper:

Oxford5k
Oxford105k

Results from the Paper


Task             Dataset   Model        Metric Name   Metric Value   Global Rank
Image Retrieval  Oxf105k   siaMAC+QE*   mAP           77.9%          #6
Image Retrieval  Oxf5k     siaMAC+QE*   mAP           82.9%          #7
Image Retrieval  Par106k   siaMAC+QE*   mAP           78.3%          #6
Image Retrieval  Par6k     siaMAC+QE*   mAP           85.6%          #5
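The metric reported above is mean average precision (mAP). As a rough illustration, the sketch below computes the common non-interpolated form of average precision per query and averages it over queries. It is a simplification: the official Oxford and Paris evaluation protocols also handle "junk" images and compute the area under the precision-recall curve slightly differently, so this code would not reproduce the listed numbers exactly. All names and the toy data are hypothetical.

```python
def average_precision(ranked_ids, positives):
    """Non-interpolated AP: mean of precision@rank over the ranks of true positives."""
    positives = set(positives)
    if not positives:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in positives:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    # Dividing by all positives penalises relevant images that were never retrieved.
    return precision_sum / len(positives)


def mean_average_precision(rankings, ground_truth):
    """Average the per-query AP over every query in `rankings`."""
    aps = [average_precision(rankings[q], ground_truth[q]) for q in rankings]
    return sum(aps) / len(aps)


# Toy example with two queries.
toy_rankings = {"q1": ["a", "b", "c", "d"], "q2": ["x", "y"]}
toy_ground_truth = {"q1": {"a", "c"}, "q2": {"y"}}
print(round(mean_average_precision(toy_rankings, toy_ground_truth), 4))
```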
