Self-Supervised Learning for Large-Scale Unsupervised Image Clustering

24 Aug 2020 · Evgenii Zheltonozhskii, Chaim Baskin, Alex M. Bronstein, Avi Mendelson

Unsupervised learning has always been appealing to machine learning researchers and practitioners, since it lets them avoid the expensive and complicated process of labeling data. However, unsupervised learning of complex data is challenging, and even the best approaches perform much worse than their supervised counterparts. Self-supervised deep learning has become a strong instrument for representation learning in computer vision, but those methods have not been evaluated in a fully unsupervised setting. In this paper, we propose a simple scheme for unsupervised classification based on self-supervised representations. We evaluate the proposed approach with several recent self-supervised methods, showing that it achieves competitive results for ImageNet classification (39% accuracy on ImageNet with 1000 clusters and 46% with overclustering). We suggest adding the unsupervised evaluation to the set of standard benchmarks for self-supervised learning. The code is available at https://github.com/Randl/kmeans_selfsuper
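
The scheme described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the self-supervised features have already been extracted into a NumPy array, uses synthetic data as a stand-in for those features, and picks the PCA dimensionality and k-means settings arbitrarily. Scoring the 1000-cluster case with a one-to-one Hungarian matching between clusters and classes is also an assumption here; the function names `cluster_features` and `hungarian_accuracy` are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score


def cluster_features(features, n_clusters, n_components, seed=0):
    """Reduce features with PCA, then assign each sample to a k-means cluster."""
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(features)
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return kmeans.fit_predict(reduced)


def hungarian_accuracy(labels, clusters):
    """Accuracy under the best one-to-one cluster-to-class assignment (assumed protocol)."""
    n = max(labels.max(), clusters.max()) + 1
    counts = np.zeros((n, n), dtype=np.int64)
    # counts[c, y] = number of samples in cluster c whose true class is y
    np.add.at(counts, (clusters, labels), 1)
    rows, cols = linear_sum_assignment(counts, maximize=True)
    return counts[rows, cols].sum() / labels.size


# Synthetic stand-in for encoder outputs: 3 well-separated classes in 64-D.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 100)
centers = rng.normal(scale=10.0, size=(3, 64))
features = centers[labels] + rng.normal(size=(300, 64))

clusters = cluster_features(features, n_clusters=3, n_components=8)
acc = hungarian_accuracy(labels, clusters)
ari = adjusted_rand_score(labels, clusters)
print(f"accuracy={acc:.3f}  ARI={ari:.3f}")
```

Labels are used only for scoring, never for fitting PCA or k-means, which is what keeps the evaluation fully unsupervised.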

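| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Unsupervised Image Classification | ImageNet | SimCLRv2 ResNet-152 + SK (PCA+k-means) | Accuracy (%) | 39.07±0.61 | # 6 |
| Unsupervised Image Classification | ImageNet | SimCLRv2 ResNet-152 + SK (PCA+k-means) | ARI | 22.80±0.60 | # 6 |
| Unsupervised Image Classification | ImageNet | SimCLRv2 ResNet-152 + SK (PCA+k-means, 1500 clusters) | Accuracy (%) | 46.03±0.21 | # 1 |
| Unsupervised Image Classification | ImageNet | SimCLRv2 ResNet-152 + SK (PCA+k-means, 1500 clusters) | ARI | 23.94±0.16 | # 5 |
| Unsupervised Image Classification | ObjectNet | InfoMin ResNeXt-152 + SK (PCA+k-means) | Accuracy (%) | 6.53±0.19 | # 1 |
| Unsupervised Image Classification | ObjectNet | InfoMin ResNeXt-152 + SK (PCA+k-means) | ARI | 1.59±0.04 | # 1 |
| Image Classification | ObjectNet | BigBiGAN (RevNet-50 4×) | Top-1 Accuracy | 4.92 | # 105 |
| Unsupervised Image Classification | ObjectNet | SimCLRv2 ResNet-152 + SK (PCA+k-means, 1500 clusters) | Accuracy (%) | 6.47±0.07 | # 2 |
| Unsupervised Image Classification | ObjectNet | SimCLRv2 ResNet-152 + SK (PCA+k-means, 1500 clusters) | ARI | 1.32±0.05 | # 2 |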
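
The 1500-cluster rows use more clusters than classes, so a one-to-one matching between clusters and classes no longer applies. This page does not say how clusters are mapped to classes in that case; one common convention, assumed here purely for illustration, is to label each cluster with its most frequent ground-truth class (a many-to-one mapping). The function name `majority_vote_accuracy` and the toy arrays are hypothetical.

```python
import numpy as np


def majority_vote_accuracy(labels, clusters):
    """Accuracy when every cluster is labeled with its most frequent true class."""
    n_clusters = clusters.max() + 1
    n_classes = labels.max() + 1
    counts = np.zeros((n_clusters, n_classes), dtype=np.int64)
    # counts[c, y] = number of samples in cluster c whose true class is y
    np.add.at(counts, (clusters, labels), 1)
    # The row maximum is the number of correctly labeled samples in that cluster.
    return counts.max(axis=1).sum() / labels.size


# Toy example: 2 classes split across 3 clusters (overclustering).
labels = np.array([0, 0, 0, 1, 1, 1])
clusters = np.array([0, 0, 1, 2, 2, 1])
acc = majority_vote_accuracy(labels, clusters)
print(acc)
```

In the toy example, clusters 0 and 2 are pure and cluster 1 is split evenly, so five of the six samples count as correct. Under this convention accuracy cannot decrease when a cluster is split, which is one reason overclustered scores tend to be higher than the equal-cluster ones.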
