Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition

Visual Place Recognition is a challenging task for robotics and autonomous systems, which must deal with the twin problems of appearance and viewpoint change in an always changing world. This paper introduces Patch-NetVLAD, which provides a novel formulation for combining the advantages of both local and global descriptor methods by deriving patch-level features from NetVLAD residuals. Unlike the fixed spatial neighborhood regime of existing local keypoint features, our method enables aggregation and matching of deep-learned local features defined over the feature-space grid. We further introduce a multi-scale fusion of patch features that have complementary scales (i.e. patch sizes) via an integral feature space and show that the fused features are highly invariant to both condition (season, structure, and illumination) and viewpoint (translation and rotation) changes. Patch-NetVLAD outperforms both global and local feature descriptor-based methods with comparable compute, achieving state-of-the-art visual place recognition results on a range of challenging real-world datasets, including winning the Facebook Mapillary Visual Place Recognition Challenge at ECCV2020. It is also adaptable to user requirements, with a speed-optimised version operating over an order of magnitude faster than the state-of-the-art. By combining superior performance with improved computational efficiency in a configurable framework, Patch-NetVLAD is well suited to enhance both stand-alone place recognition capabilities and the overall performance of SLAM systems.
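The "integral feature space" mentioned above can be illustrated with a summed-area table over a grid of feature vectors, which lets the aggregate of any square patch be read off in constant time regardless of patch size. The sketch below is a minimal pure-Python illustration under that assumption and is not the authors' implementation: the function names `integral_grid` and `patch_sum` are hypothetical, the toy grid values are made up, and a plain sum stands in for whatever aggregation is applied to the NetVLAD residuals.

```python
def integral_grid(feats):
    """Summed-area table over an H x W grid of D-dimensional feature vectors."""
    h, w, d = len(feats), len(feats[0]), len(feats[0][0])
    # One extra row and column of zeros simplifies the boundary cases.
    table = [[[0.0] * d for _ in range(w + 1)] for _ in range(h + 1)]
    for i in range(h):
        for j in range(w):
            table[i + 1][j + 1] = [
                feats[i][j][k]
                + table[i][j + 1][k]
                + table[i + 1][j][k]
                - table[i][j][k]
                for k in range(d)
            ]
    return table


def patch_sum(table, top, left, size):
    """Sum of the feature vectors inside the size x size patch at (top, left)."""
    bottom, right = top + size, left + size
    d = len(table[0][0])
    return [
        table[bottom][right][k]
        - table[top][right][k]
        - table[bottom][left][k]
        + table[top][left][k]
        for k in range(d)
    ]


# Toy 3 x 3 grid of 2-D features (illustrative values only).
grid = [[[float(r), float(c)] for c in range(3)] for r in range(3)]
table = integral_grid(grid)

# The same table serves several patch sizes, which is what makes
# multi-scale patch features cheap to extract.
small = patch_sum(table, 0, 0, 2)  # 2 x 2 patch
large = patch_sum(table, 0, 0, 3)  # 3 x 3 patch
```

Because the table is built once, adding another patch size costs only four lookups per patch and per feature dimension.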

Venue: CVPR 2021
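| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Visual Localization | Extended CMU Seasons | Patch-NetVLAD | Acc @ 0.25m, 2° | 0.118 | # 1 |
| Visual Localization | Extended CMU Seasons | Patch-NetVLAD | Acc @ 0.5m, 5° | 0.362 | # 1 |
| Visual Localization | Extended CMU Seasons | Patch-NetVLAD | Acc @ 5m, 10° | 0.962 | # 1 |
| Visual Place Recognition | Mapillary val | Patch-NetVLAD | Recall@1 | 79.5 | # 8 |
| Visual Place Recognition | Mapillary val | Patch-NetVLAD | Recall@5 | 86.2 | # 9 |
| Visual Place Recognition | Mapillary val | Patch-NetVLAD | Recall@10 | 87.7 | # 10 |
| Visual Place Recognition | Nordland | Patch-NetVLAD | Recall@1 | 58.4 | # 4 |
| Visual Place Recognition | Nordland | Patch-NetVLAD | Recall@5 | 74.6 | # 4 |
| Visual Place Recognition | Nordland | Patch-NetVLAD | Recall@10 | 80.0 | # 4 |
| Visual Place Recognition | Pittsburgh-30k-test | Patch-NetVLAD | Recall@1 | 88.7 | # 6 |
| Visual Place Recognition | Pittsburgh-30k-test | Patch-NetVLAD | Recall@5 | 94.5 | # 5 |
| Visual Place Recognition | Pittsburgh-30k-test | Patch-NetVLAD | Recall@10 | 95.9 | # 4 |
| Visual Localization | RobotCar Seasons v2 | Patch-NetVLAD | Acc @ 0.25m, 2° | 0.096 | # 1 |
| Visual Localization | RobotCar Seasons v2 | Patch-NetVLAD | Acc @ 0.5m, 5° | 0.353 | # 1 |
| Visual Localization | RobotCar Seasons v2 | Patch-NetVLAD | Acc @ 5m, 10° | 0.909 | # 1 |
| Visual Place Recognition | Tokyo247 | Patch-NetVLAD | Recall@1 | 86 | # 3 |
| Visual Place Recognition | Tokyo247 | Patch-NetVLAD | Recall@5 | 88.6 | # 3 |
| Visual Place Recognition | Tokyo247 | Patch-NetVLAD | Recall@10 | 90.5 | # 3 |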
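
The Recall@N figures above are conventionally read as the percentage of query images for which at least one of the top N retrieved reference images is a correct match. The sketch below shows that computation under this assumption; the function name `recall_at_n`, the toy query and reference identifiers, and the ground-truth sets are all hypothetical, and the rule that decides which references count as correct (for example a distance threshold) is dataset-specific and not stated here.

```python
def recall_at_n(ranked, positives, n):
    """Percentage of queries with at least one correct match in the top n.

    ranked:    dict mapping query id -> list of reference ids, best match first
    positives: dict mapping query id -> set of reference ids counted as correct
    """
    hits = sum(
        1
        for query, retrieved in ranked.items()
        if any(ref in positives[query] for ref in retrieved[:n])
    )
    return 100.0 * hits / len(ranked)


# Toy retrieval results for four queries (illustrative identifiers only).
ranked = {
    "q1": ["a", "b", "c"],
    "q2": ["d", "e", "f"],
    "q3": ["g", "h", "i"],
    "q4": ["j", "k", "l"],
}
positives = {"q1": {"a"}, "q2": {"e"}, "q3": {"z"}, "q4": {"l"}}

r1 = recall_at_n(ranked, positives, 1)
r2 = recall_at_n(ranked, positives, 2)
r3 = recall_at_n(ranked, positives, 3)
```

Recall@N can only stay level or rise as N grows, which matches the pattern in every Recall@1, Recall@5, Recall@10 triple in the table.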
