NetVLAD: CNN architecture for weakly supervised place recognition

We tackle the problem of large scale visual place recognition, where the task is to quickly and accurately recognize the location of a given query photograph. We present the following three principal contributions. First, we develop a convolutional neural network (CNN) architecture that is trainable in an end-to-end manner directly for the place recognition task. The main component of this architecture, NetVLAD, is a new generalized VLAD layer, inspired by the "Vector of Locally Aggregated Descriptors" image representation commonly used in image retrieval. The layer is readily pluggable into any CNN architecture and amenable to training via backpropagation. Second, we develop a training procedure, based on a new weakly supervised ranking loss, to learn parameters of the architecture in an end-to-end manner from images depicting the same places over time downloaded from Google Street View Time Machine. Finally, we show that the proposed architecture significantly outperforms non-learnt image representations and off-the-shelf CNN descriptors on two challenging place recognition benchmarks, and improves over current state-of-the-art compact image representations on standard image retrieval benchmarks.
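The abstract names two components without giving formulas: the NetVLAD pooling layer and the weakly supervised ranking loss. The sketch below illustrates both in plain Python. It assumes the usual formulation: each local descriptor is soft-assigned to K clusters by a softmax over linear scores, the weighted residuals to the cluster centres are summed, and the result is normalized per cluster and then as a whole. The loss is assumed to be a hinge on the gap between the best-matching potential positive and each negative. The function names, argument layout, and the margin value are illustrative assumptions, not the authors' code. In a real network the weights, biases, and centres would be learnt by backpropagation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / max(norm, eps) for v in vec]


def netvlad_pool(descriptors, centers, weights, biases):
    """Aggregate N local D-dim descriptors into one K*D-dim vector.

    descriptors: N lists of length D (e.g. CNN feature-map columns)
    centers, weights: K lists of length D; biases: K floats
    """
    num_clusters, dim = len(centers), len(centers[0])
    vlad = [[0.0] * dim for _ in range(num_clusters)]
    for x in descriptors:
        # Soft assignment: softmax over the per-cluster linear scores w_k . x + b_k.
        scores = [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
                  for w, b in zip(weights, biases)]
        assign = softmax(scores)
        # Accumulate residuals (x - c_k), weighted by the soft assignment.
        for k in range(num_clusters):
            for j in range(dim):
                vlad[k][j] += assign[k] * (x[j] - centers[k][j])
    # Normalize each cluster's residual sum, flatten, then normalize the whole vector.
    vlad = [l2_normalize(row) for row in vlad]
    return l2_normalize([v for row in vlad for v in row])


def weak_ranking_loss(pos_sq_dists, neg_sq_dists, margin=0.1):
    """Hinge loss using only the closest potential positive.

    pos_sq_dists: squared distances from the query to its potential positives
    neg_sq_dists: squared distances from the query to definite negatives
    margin: illustrative value, treated here as an assumption
    """
    best_positive = min(pos_sq_dists)
    return sum(max(0.0, best_positive + margin - d) for d in neg_sq_dists)


# Toy example: three 2-D descriptors pooled over two clusters.
descriptor = netvlad_pool(
    descriptors=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    centers=[[0.0, 0.0], [1.0, 1.0]],
    weights=[[1.0, 0.0], [0.0, 1.0]],
    biases=[0.0, 0.0],
)
loss = weak_ranking_loss([0.5, 0.2], [0.25, 1.0], margin=0.1)
```

Taking the minimum over potential positives is what makes the supervision weak in this sketch: images taken near the query may or may not show the same scene, so only the best-matching one is required to rank ahead of the negatives.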

Published in: CVPR 2016
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
| --- | --- | --- | --- | --- | --- |
| Visual Place Recognition | Baidu Mall | NetVLAD | Recall@1 | 53.10 | # 4 |
| Visual Place Recognition | Berlin Kudamm | NetVLAD | Recall@1 | 38.21 | # 4 |
| Visual Place Recognition | Pittsburgh-30k-test | NetVLAD | Recall@1 | 86.08 | # 8 |

Results from Other Papers


| Task | Dataset | Model | Metric Name | Metric Value | Rank |
| --- | --- | --- | --- | --- | --- |
| Visual Place Recognition | 17 Places | NetVLAD | Recall@1 | 61.58 | # 5 |
| Visual Place Recognition | Gardens Point | NetVLAD | Recall@1 | 58.50 | # 6 |
| Visual Place Recognition | Hawkins | NetVLAD | Recall@1 | 34.75 | # 3 |
| Visual Place Recognition | Laurel Caverns | NetVLAD | Recall@1 | 39.29 | # 4 |
| Visual Place Recognition | Mid-Atlantic Ridge | NetVLAD | Recall@1 | 25.74 | # 3 |
| Visual Place Recognition | Nardo-Air | NetVLAD | Recall@1 | 19.72 | # 6 |
| Visual Place Recognition | Nardo-Air R | NetVLAD | Recall@1 | 60.56 | # 8 |
| Visual Place Recognition | Oxford RobotCar Dataset | NetVLAD | Recall@1 | 52.88 | # 4 |
| Visual Place Recognition | St Lucia | NetVLAD | Recall@1 | 57.92 | # 7 |
| Visual Place Recognition | VP-Air | NetVLAD | Recall@1 | 6.39 | # 7 |
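
The Recall@1 figures above are percentages of queries for which the top-ranked database image counts as a correct match. The listing does not say how correctness is decided for each dataset, so the sketch below assumes a common convention: a query is correct if any of its top N retrieved images lies within a fixed distance of the query's ground-truth position. The function name, the 2-D position format, and the 25-unit threshold are illustrative assumptions. Individual benchmarks may use other criteria, such as a frame-index tolerance.

```python
import math


def recall_at_n(query_positions, ranked_positions, n=1, threshold=25.0):
    """Percentage of queries with at least one correct match in the top n.

    query_positions: ground-truth (x, y) position of each query
    ranked_positions: for each query, the (x, y) positions of the retrieved
        database images, best match first
    threshold: maximum distance for a match to count (assumed value)
    """
    if not query_positions:
        raise ValueError("at least one query is required")
    hits = 0
    for query_pos, ranked in zip(query_positions, ranked_positions):
        # A query is a hit if any of its top-n results is close enough.
        if any(math.dist(query_pos, pos) <= threshold for pos in ranked[:n]):
            hits += 1
    return 100.0 * hits / len(query_positions)


# Toy example: the first query is matched at rank 1, the second only at rank 2.
queries = [(0.0, 0.0), (100.0, 100.0)]
rankings = [
    [(3.0, 4.0), (500.0, 500.0)],
    [(0.0, 0.0), (110.0, 100.0)],
]
recall_1 = recall_at_n(queries, rankings, n=1)
recall_2 = recall_at_n(queries, rankings, n=2)
```

With these toy inputs, Recall@1 is 50.0 and Recall@2 is 100.0, since widening the shortlist lets the second query's near match count.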
