Graph-RISE: Graph-Regularized Image Semantic Embedding

Learning image representations to capture fine-grained semantics has been a challenging and important task enabling many applications such as image search and clustering. In this paper, we present Graph-Regularized Image Semantic Embedding (Graph-RISE), a large-scale neural graph learning framework that allows us to train embeddings to discriminate an unprecedented O(40M) ultra-fine-grained semantic labels. Graph-RISE outperforms state-of-the-art image embedding algorithms on several evaluation tasks, including image classification and triplet ranking. We provide case studies to demonstrate that, qualitatively, image retrieval based on Graph-RISE effectively captures semantics and, compared to the state-of-the-art, differentiates nuances at levels that are closer to human-perception.

PDF Abstract

Results from the Paper


Task Dataset Model Metric Name Metric Value Global Rank Result Benchmark
Image Classification ImageNet Graph-RISE (40M) Top 1 Accuracy 68.29% # 959
Image Classification iNaturalist Graph-RISE (40M) Top 1 Accuracy 31.12% # 11
Top 5 Accuracy 52.76% # 3

Methods