Hard negative examples are hard, but useful

Triplet loss is an extremely common approach to distance metric learning. Representations of images from the same class are optimized to be mapped closer together in an embedding space than representations of images from different classes. Much work on triplet losses focuses on selecting the most useful triplets of images to consider, with strategies that select dissimilar examples from the same class or similar examples from different classes. The consensus of previous research is that optimizing with the hardest negative examples leads to bad training behavior. That's a problem -- these hardest negatives are literally the cases where the distance metric fails to capture semantic similarity. In this paper, we characterize the space of triplets and derive why hard negatives make triplet loss training fail. We offer a simple fix to the loss function and show that, with this fix, optimizing with hard negative examples becomes feasible. This leads to more generalizable features, and image retrieval results that outperform the state of the art for datasets with high intra-class variance.
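For readers who want a concrete picture of what "optimizing with the hardest negative examples" means, the following is a minimal NumPy sketch of a standard margin-based triplet loss in which each anchor is paired with its closest different-class example. It is not the fix proposed in the paper, which the abstract does not spell out. The function name `hardest_negative_triplet_loss`, the Euclidean distance, and the margin of 0.2 are illustrative assumptions.

```python
import numpy as np


def hardest_negative_triplet_loss(embeddings, labels, margin=0.2):
    """Mean triplet loss over all anchor-positive pairs, each paired with
    the anchor's hardest (closest) negative. Illustrative sketch only."""
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    # Pairwise Euclidean distances between all embeddings.
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    same_class = labels[:, None] == labels[None, :]
    not_self = ~np.eye(len(labels), dtype=bool)
    pos_mask = same_class & not_self   # valid (anchor, positive) pairs
    neg_mask = ~same_class             # valid (anchor, negative) pairs

    # Hardest negative per anchor: the closest example from another class.
    # Anchors with no negatives get +inf and are skipped below.
    hardest_neg = np.where(neg_mask, dist, np.inf).min(axis=1)

    losses = []
    for a, p in zip(*np.nonzero(pos_mask)):
        if np.isfinite(hardest_neg[a]):
            # Hinge on d(anchor, positive) - d(anchor, negative) + margin.
            losses.append(max(0.0, dist[a, p] - hardest_neg[a] + margin))
    return float(np.mean(losses)) if losses else 0.0
```

When the hardest negative sits closer to the anchor than the positive does, the hinge is active for that triplet; these are the cases the abstract describes as the distance metric failing to capture semantic similarity.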

ECCV 2020

Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Metric Learning | CARS196 | SCT(64) | R@1 | 73.2 | #36 |
| Metric Learning | CUB-200-2011 | SCT(64) | R@1 | 57.7 | #28 |
| Metric Learning | In-Shop | SCT(512) | R@1 | 90 | #14 |
| Metric Learning | Stanford Online Products | SCT(512) | R@1 | 81.6 | #14 |
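The R@1 metric reported above is Recall@1 for image retrieval. As a rough sketch of how such a number is commonly computed, the snippet below scores a set of embeddings by checking whether each query's nearest neighbor (excluding itself) shares its class label. It assumes the query set and the gallery are the same collection; benchmarks with a separate gallery would need a small change. The function name `recall_at_1` is hypothetical and not taken from the paper.

```python
import numpy as np


def recall_at_1(embeddings, labels):
    """Fraction of items whose nearest other item has the same label.
    Illustrative sketch; assumes queries and gallery are the same set."""
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    # Squared Euclidean distances (ranking is the same as for Euclidean).
    diff = emb[:, None, :] - emb[None, :, :]
    dist = (diff ** 2).sum(axis=-1)

    # An item must not retrieve itself.
    np.fill_diagonal(dist, np.inf)

    nearest = dist.argmin(axis=1)
    return float((labels[nearest] == labels).mean())
```

Multiplying the returned fraction by 100 gives a percentage on the same scale as the table values.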

Methods