A Simple Framework for Contrastive Learning of Visual Representations

This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100X fewer labels.
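The contrastive objective described in the abstract can be illustrated with a minimal sketch of the normalized temperature-scaled cross-entropy (NT-Xent) loss used by SimCLR. This is a sketch, not the authors' implementation: the function names `l2_normalize` and `nt_xent_loss`, the default `temperature=0.5`, and the plain-list input format are assumptions made for illustration. The inputs stand in for the outputs of the learnable nonlinear projection head, which is not modeled here.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """NT-Xent loss over a batch of paired projections.

    z_a[i] and z_b[i] are assumed to be projections of two augmented views
    of the same example i; every other vector in the batch is a negative.
    """
    n = len(z_a)
    z = [l2_normalize(v) for v in z_a + z_b]  # 2n unit vectors
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same example
        # Similarities of anchor i to every other vector (self excluded).
        logits = [
            sum(a * b for a, b in zip(z[i], z[k])) / temperature
            for k in range(2 * n)
            if k != i
        ]
        pos_logit = sum(a * b for a, b in zip(z[i], z[pos])) / temperature
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - pos_logit
    return total / (2 * n)


# Toy batch: two examples, two views each (hypothetical values).
views_a = [[1.0, 0.0], [0.0, 1.0]]
views_b = [[1.0, 0.1], [0.1, 1.0]]
aligned = nt_xent_loss(views_a, views_b)
mismatched = nt_xent_loss(views_a, views_b[::-1])
```

In this toy batch, `aligned` comes out lower than `mismatched`, because each anchor's positive is also its nearest neighbor. Since every other vector in the batch acts as a negative, a larger batch supplies more negatives per anchor, which is one intuition for the abstract's finding that contrastive learning benefits from larger batch sizes.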

Venue: ICML 2020
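| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Self-Supervised Image Classification | ImageNet | SimCLR (ResNet-50 4×) | Top 1 Accuracy | 76.5% | #57 |
| Self-Supervised Image Classification | ImageNet | SimCLR (ResNet-50 4×) | Top 5 Accuracy | 93.2% | #9 |
| Self-Supervised Image Classification | ImageNet | SimCLR (ResNet-50 4×) | Number of Params | 375M | #13 |
| Self-Supervised Image Classification | ImageNet | SimCLR (ResNet-50 2×) | Top 1 Accuracy | 74.2% | #82 |
| Self-Supervised Image Classification | ImageNet | SimCLR (ResNet-50 2×) | Top 5 Accuracy | 92.0% | #15 |
| Self-Supervised Image Classification | ImageNet | SimCLR (ResNet-50 2×) | Number of Params | 94M | #29 |
| Self-Supervised Image Classification | ImageNet | SimCLR (ResNet-50) | Top 1 Accuracy | 69.3% | #97 |
| Self-Supervised Image Classification | ImageNet | SimCLR (ResNet-50) | Top 5 Accuracy | 89.0% | #26 |
| Self-Supervised Image Classification | ImageNet | SimCLR (ResNet-50) | Number of Params | 24M | #48 |
| Self-Supervised Image Classification | ImageNet (finetuned) | SimCLR (ResNet-50) | Top 1 Accuracy | 77.2% | #63 |
| Semi-Supervised Image Classification | ImageNet - 10% labeled data | SimCLR (ResNet-50 4×) | Top 5 Accuracy | 92.6% | #7 |
| Semi-Supervised Image Classification | ImageNet - 10% labeled data | SimCLR (ResNet-50 2×) | Top 5 Accuracy | 91.2% | #17 |
| Semi-Supervised Image Classification | ImageNet - 10% labeled data | SimCLR (ResNet-50) | Top 5 Accuracy | 87.8% | #30 |
| Semi-Supervised Image Classification | ImageNet - 1% labeled data | SimCLR (ResNet-50 4×) | Top 5 Accuracy | 85.8% | #14 |
| Semi-Supervised Image Classification | ImageNet - 1% labeled data | SimCLR (ResNet-50 4×) | Top 1 Accuracy | 63.0% | #26 |
| Semi-Supervised Image Classification | ImageNet - 1% labeled data | SimCLR (ResNet-50 2×) | Top 5 Accuracy | 83.0% | #17 |
| Semi-Supervised Image Classification | ImageNet - 1% labeled data | SimCLR (ResNet-50 2×) | Top 1 Accuracy | 58.5% | #33 |
| Semi-Supervised Image Classification | ImageNet - 1% labeled data | SimCLR (ResNet-50) | Top 5 Accuracy | 75.5% | #29 |
| Semi-Supervised Image Classification | ImageNet - 1% labeled data | SimCLR (ResNet-50) | Top 1 Accuracy | 48.3% | #44 |
| Contrastive Learning | imagenet-1k | ResNet50 | ImageNet Top-1 Accuracy | 69.3 | #4 |
| Image Classification | Places205 | SimCLR | Top 1 Accuracy | 53.3 | #13 |
| Self-Supervised Person Re-Identification | SYSU-30k | SimCLR | Rank-1 | 10.9 | #4 |
| Person Re-Identification | SYSU-30k | SimCLR (self-supervised) | Rank-1 | 10.9 | #8 |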

Methods