Self-Supervised Image Classification

85 papers with code • 2 benchmarks • 1 dataset

This is the task of image classification using representations learnt with self-supervised learning. Self-supervised methods generally involve a pretext task that is solved to learn a good representation, together with a loss function to learn with. One example of a loss function is an autoencoder-based loss, where the goal is to reconstruct an image pixel by pixel. A more popular recent example is a contrastive loss, which measures the similarity of sample pairs in a representation space and where the target can vary from sample to sample rather than being a fixed reconstruction target (as in the case of autoencoders).
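
For concreteness, the snippet below is a minimal sketch of an InfoNCE/NT-Xent-style contrastive loss of the kind popularised by SimCLR, assuming two augmented views per image; the function and variable names (`nt_xent_loss`, `temperature`, and the `encoder`/`projector` in the usage comment) are illustrative rather than taken from any specific codebase.

```python
# Illustrative NT-Xent-style contrastive loss (sketch, not an official implementation).
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z1, z2: (N, D) projections of two augmented views of the same N images."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit-norm embeddings
    sim = z @ z.t() / temperature                        # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))                    # exclude self-similarity
    # The positive for row i is the other view of the same image: i <-> i + n (mod 2N).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Usage (hypothetical encoder/projector):
#   z1, z2 = projector(encoder(view1)), projector(encoder(view2))
#   loss = nt_xent_loss(z1, z2)
```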

A common evaluation protocol is to train a linear classifier on top of (frozen) representations learnt by self-supervised methods. The leaderboards for the linear evaluation protocol can be found below. In practice, however, it is more common to fine-tune the features on a downstream task. An alternative evaluation protocol therefore uses semi-supervised learning and fine-tunes on a fraction of the labels (typically 1% or 10%). The leaderboards for the fine-tuning protocol can be accessed here.
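
As a rough illustration of the linear evaluation protocol, the sketch below trains a single linear head on frozen features; `backbone`, `train_loader`, `feat_dim`, `num_classes`, and the hyperparameters are placeholders rather than the settings of any particular leaderboard entry.

```python
# Linear probing sketch: train only a linear classifier on frozen self-supervised features.
import torch
import torch.nn as nn

def linear_probe(backbone: nn.Module, train_loader, feat_dim: int, num_classes: int,
                 epochs: int = 10, lr: float = 1e-3, device: str = "cuda") -> nn.Linear:
    backbone.eval().to(device)                # pretrained encoder stays frozen
    for p in backbone.parameters():
        p.requires_grad_(False)

    head = nn.Linear(feat_dim, num_classes).to(device)
    optimizer = torch.optim.SGD(head.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            with torch.no_grad():             # no gradients through the frozen backbone
                feats = backbone(images)
            loss = criterion(head(feats), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return head
```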

You may want to read some blog posts before reading the papers and checking the leaderboards.

There is also Yann LeCun's talk at AAAI-20, which you can watch here (from 35:00 onwards).

( Image credit: A Simple Framework for Contrastive Learning of Visual Representations )

Libraries

Use these libraries to find Self-Supervised Image Classification models and implementations
See all 18 libraries.

Papers


MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations

ml-jku/MIM-Refiner 15 Feb 2024

The motivation behind MIM-Refiner is rooted in the insight that optimal representations within MIM models generally reside in intermediate layers.


Masked Image Residual Learning for Scaling Deeper Vision Transformers

russellllaputa/MIRL NeurIPS 2023

With the same level of computational complexity as ViT-Base and ViT-Large, we instantiate 4.5$\times$ and 2$\times$ deeper ViTs, dubbed ViT-S-54 and ViT-B-48.

25 Sep 2023

Masking Augmentation for Supervised Learning

naver-ai/augsub 20 Jun 2023

In this paper, we propose a novel way to involve masking augmentations dubbed Masked Sub-model (MaskSub).


ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

modelscope/modelscope 18 May 2023

In this work, we explore a scalable way for building a general representation model toward unlimited modalities.


Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget

ml-jku/mae-ct 20 Apr 2023

In this work, we study how to combine the efficiency and scalability of MIM with the ability of ID to perform downstream classification in the absence of large amounts of labeled data.


DINOv2: Learning Robust Visual Features without Supervision

huggingface/transformers 14 Apr 2023

The recent breakthroughs in natural language processing for model pretraining on large quantities of data have opened the way for similar foundation models in computer vision.


Unicom: Universal and Compact Representation Learning for Image Retrieval

OML-Team/open-metric-learning 12 Apr 2023

To further enhance the low-dimensional feature representation, we randomly select partial feature dimensions when calculating the similarities between embeddings and class-wise prototypes.


VNE: An Effective Method for Improving Deep Representation by Manipulating Eigenvalue Distribution

jaeill/CVPR23-VNE CVPR 2023

Since the introduction of deep learning, a wide scope of representation properties, such as decorrelation, whitening, disentanglement, rank, isotropy, and mutual information, have been studied to improve the quality of representation.

04 Apr 2023

All4One: Symbiotic Neighbour Contrastive Learning via Self-Attention and Redundancy Reduction

vturrisi/solo-learn ICCV 2023

Nearest neighbour based methods have proved to be one of the most successful self-supervised learning (SSL) approaches due to their high generalization capabilities.

16 Mar 2023

Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling

keyu-tian/spark 9 Jan 2023

This is the first use of sparse convolution for 2D masked modeling.
