Self-Supervised Image Classification

85 papers with code • 2 benchmarks • 1 dataset

This is the task of image classification using representations learnt with self-supervised learning. Self-supervised methods generally involve a pretext task that is solved to learn a good representation, together with a loss function to learn with. One example of a loss function is an autoencoder-based loss, where the goal is to reconstruct an image pixel by pixel. A more popular recent example is a contrastive loss, which measures the similarity of sample pairs in a representation space; unlike an autoencoder's fixed reconstruction target, the target here can vary from pair to pair.
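
As a concrete illustration, here is a minimal sketch of an NT-Xent-style contrastive loss in PyTorch, in the spirit of SimCLR; the function name, temperature value, and tensor shapes are illustrative rather than taken from any particular implementation.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """Contrastive loss over embeddings of two augmented views of the same batch."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2n, d), unit norm
    sim = z @ z.t() / temperature                       # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))                   # exclude self-similarity
    # For sample i, the positive is its other view at index i+n (or i-n).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```

In a training loop, `z1` and `z2` would be the projected embeddings of two random augmentations of the same batch of images.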

A common evaluation protocol is to train a linear classifier on top of the (frozen) representations learnt by a self-supervised method; the leaderboards for this linear evaluation protocol can be found below. In practice, it is more common to fine-tune features on a downstream task. An alternative evaluation protocol therefore uses semi-supervised learning and fine-tunes on a percentage of the labels; the leaderboards for the fine-tuning protocol can be accessed here.
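
For concreteness, a minimal sketch of the linear evaluation protocol might look as follows; the ResNet-50 backbone, the 1000-class head, and `train_loader` are placeholders for a pretrained model and your data pipeline, not a prescribed setup.

```python
import torch
import torch.nn as nn
import torchvision

# Placeholder backbone: a ResNet-50 with its classification head removed;
# in practice you would load self-supervised pretrained weights here.
encoder = torchvision.models.resnet50()
feat_dim = encoder.fc.in_features      # 2048 for ResNet-50
encoder.fc = nn.Identity()

encoder.eval()
for p in encoder.parameters():
    p.requires_grad = False            # freeze the representation

probe = nn.Linear(feat_dim, 1000)      # linear classifier, e.g. 1000 ImageNet classes
optimizer = torch.optim.SGD(probe.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()

for images, labels in train_loader:    # `train_loader` is assumed to exist
    with torch.no_grad():
        feats = encoder(images)        # frozen features, no gradients
    loss = criterion(probe(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```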

Before reading the papers and checking the leaderboards, you may want to watch Yann LeCun's talk at AAAI-20 here (from 35:00).

(Image credit: A Simple Framework for Contrastive Learning of Visual Representations)

Libraries

Use these libraries to find Self-Supervised Image Classification models and implementations
See all 18 libraries.

Most implemented papers

Revisiting Self-Supervised Visual Representation Learning

google/revisiting-self-supervised CVPR 2019

Unsupervised visual representation learning remains a largely unsolved problem in computer vision research.

Self-Supervised Learning with Swin Transformers

SwinTransformer/Transformer-SSL 10 May 2021

We are witnessing a modeling shift from CNNs to Transformers in computer vision.

VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning

facebookresearch/vicreg NeurIPS 2021

Recent self-supervised methods for image representation learning are based on maximizing the agreement between embedding vectors from different views of the same image.
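
The VICReg objective combines an invariance (mean-squared-error) term between the two views' embeddings with a variance term that keeps each embedding dimension's standard deviation above a threshold and a covariance term that decorrelates dimensions. A minimal sketch using the paper's default coefficients; tensor names and shapes are illustrative:

```python
import torch
import torch.nn.functional as F

def vicreg_loss(za: torch.Tensor, zb: torch.Tensor,
                lam: float = 25.0, mu: float = 25.0, nu: float = 1.0) -> torch.Tensor:
    n, d = za.shape
    inv = F.mse_loss(za, zb)                       # invariance: agreement between views

    def variance(z):                               # hinge on per-dimension std dev
        std = torch.sqrt(z.var(dim=0) + 1e-4)
        return F.relu(1.0 - std).mean()

    def covariance(z):                             # penalize off-diagonal covariance
        z = z - z.mean(dim=0)
        cov = (z.t() @ z) / (n - 1)
        off_diag = cov - torch.diag(torch.diag(cov))
        return off_diag.pow(2).sum() / d

    return (lam * inv
            + mu * (variance(za) + variance(zb)) / 2
            + nu * (covariance(za) + covariance(zb)))
```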

Context Autoencoder for Self-Supervised Representation Learning

atten4vis/cae 7 Feb 2022

The pretraining comprises two tasks: masked representation prediction, which predicts the representations of the masked patches, and masked patch reconstruction, which reconstructs the masked patches themselves.
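
A heavily simplified sketch of how these two losses might be combined; the actual model couples an encoder, a latent regressor, and a decoder, and reconstructs discrete visual tokens rather than raw pixels, so every name, target, and weight below is a placeholder.

```python
import torch.nn.functional as F

def cae_style_loss(pred_repr, target_repr, pred_patches, target_patches, alpha=1.0):
    # Masked representation prediction: regress the (detached) target
    # representations of the masked patches.
    repr_loss = F.mse_loss(pred_repr, target_repr.detach())
    # Masked patch reconstruction: reconstruct the masked patches; the paper
    # predicts discrete tokens, simplified here to pixel regression.
    recon_loss = F.mse_loss(pred_patches, target_patches)
    return recon_loss + alpha * repr_loss
```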

EVA: Exploring the Limits of Masked Visual Representation Learning at Scale

rwightman/pytorch-image-models CVPR 2023

We launch EVA, a vision-centric foundation model to explore the limits of visual representation at scale using only publicly accessible data.

Masked Feature Prediction for Self-Supervised Visual Pre-Training

facebookresearch/SlowFast CVPR 2022

We present Masked Feature Prediction (MaskFeat) for self-supervised pre-training of video models.

Unsupervised Feature Learning via Non-Parametric Instance Discrimination

zhirongw/lemniscate.pytorch CVPR 2018

Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so.

Data-Efficient Image Recognition with Contrastive Predictive Coding

philip-bachman/amdim-public ICML 2020

Human observers can learn to recognize new categories of images from a handful of examples, yet doing so with artificial ones remains an open challenge.

Large Scale Adversarial Representation Learning

lukemelas/unsupervised-image-segmentation NeurIPS 2019

We extensively evaluate the representation learning and generation capabilities of these BigBiGAN models, demonstrating that these generation-based models achieve the state of the art in unsupervised representation learning on ImageNet, as well as in unconditional image generation.

Self-labelling via simultaneous clustering and representation learning

yukimasano/self-label ICLR 2020

Combining clustering and representation learning is one of the most promising approaches for unsupervised learning of deep neural networks.