Self-Supervised Image Classification

85 papers with code • 2 benchmarks • 1 dataset

This is the task of image classification using representations learnt with self-supervised learning. Self-supervised methods generally involve a pretext task that is solved to learn a good representation, together with a loss function to learn with. One example is an autoencoder-based loss, where the goal is to reconstruct an image pixel by pixel. A more popular recent example is a contrastive loss, which measures the similarity of sample pairs in a representation space and allows a varying target rather than a fixed reconstruction target (as in the case of autoencoders).
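
To make the contrastive case concrete, here is a minimal sketch of an NT-Xent / InfoNCE-style loss of the kind used by SimCLR-like methods. The function name, batch size, and embedding dimension are illustrative placeholders rather than any particular paper's implementation.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """z1, z2: (batch, dim) embeddings of two augmented views of the same images."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2n, dim), unit-norm embeddings
    sim = z @ z.t() / temperature                         # pairwise cosine similarities as logits
    sim.fill_diagonal_(float("-inf"))                     # a sample is never its own positive
    # The positive for view i is the other view of the same image: index i+n (or i-n).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Dummy embeddings standing in for encoder outputs of two augmented views:
loss = nt_xent_loss(torch.randn(8, 128), torch.randn(8, 128))
```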

A common evaluation protocol is to train a linear classifier on top of (frozen) representations learnt by self-supervised methods (a minimal sketch is given below). The leaderboards for the linear evaluation protocol can be found below. In practice, it is more common to fine-tune features on a downstream task, so an alternative evaluation protocol uses semi-supervised learning and fine-tunes on a percentage of the labels. Leaderboards for the fine-tuning protocol are also available.
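
A minimal sketch of the linear evaluation protocol, assuming a generic PyTorch backbone; the placeholder encoder, dummy data shapes, and hyperparameters are illustrative assumptions, not a prescribed recipe.

```python
import torch
import torch.nn as nn

# Placeholder backbone standing in for a self-supervised pretrained encoder.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512), nn.ReLU())
for p in encoder.parameters():
    p.requires_grad = False          # representations stay frozen
encoder.eval()

linear_probe = nn.Linear(512, 10)    # the only trainable module
optimizer = torch.optim.SGD(linear_probe.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# Dummy labelled batch standing in for the downstream dataset.
images = torch.randn(16, 3, 32, 32)
labels = torch.randint(0, 10, (16,))

with torch.no_grad():
    features = encoder(images)       # features come from the frozen encoder
loss = criterion(linear_probe(features), labels)
loss.backward()
optimizer.step()
```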

You may want to read some introductory blog posts before reading the papers and checking the leaderboards.

There is also Yann LeCun's talk at AAAI-20 (from 35:00 onward).

(Image credit: A Simple Framework for Contrastive Learning of Visual Representations)

Libraries

Use these libraries to find Self-Supervised Image Classification models and implementations
See all 18 libraries.

Latest papers with no code

Estimating Physical Information Consistency of Channel Data Augmentation for Remote Sensing Images

no code yet • 21 Mar 2024

The application of data augmentation for deep learning (DL) methods plays an important role in achieving state-of-the-art results in supervised, semi-supervised, and self-supervised image classification.

Perceptual Group Tokenizer: Building Perception with Iterative Grouping

no code yet • 30 Nov 2023

In this paper, we propose the Perceptual Group Tokenizer, a model that relies entirely on grouping operations to extract visual features and perform self-supervised representation learning; a series of grouping operations is used to iteratively hypothesize the context for pixels or superpixels and refine feature representations.

DINO as a von Mises-Fisher mixture model

no code yet • ICLR 2023

With this interpretation, DINO assumes equal precision for all components when the prototypes are also $L^2$-normalized.

Masked Reconstruction Contrastive Learning with Information Bottleneck Principle

no code yet • 15 Nov 2022

To effectively alleviate overfitting to discriminative information, we employ the reconstruction task to regularize the discriminative task.
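
As a rough illustration of this general idea (not the paper's actual formulation), a reconstruction term can be added to a contrastive objective as a regularizer; the weighting `lam` and the function signature below are assumptions.

```python
import torch
import torch.nn.functional as F

def regularized_loss(z1, z2, recon, target, contrastive_fn, lam: float = 0.1):
    """Contrastive (discriminative) term plus a weighted reconstruction regularizer."""
    disc = contrastive_fn(z1, z2)            # e.g. an NT-Xent-style contrastive loss
    recon_term = F.mse_loss(recon, target)   # reconstruction of (masked) pixels
    return disc + lam * recon_term
```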

Relational Self-Supervised Learning

no code yet • 16 Mar 2022

Self-supervised learning (SSL), including mainstream contrastive learning, has achieved great success in learning visual representations without data annotations.

Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet?

no code yet • 13 Jan 2022

Most notably, ReLICv2 is the first unsupervised representation learning method to consistently outperform the supervised baseline in a like-for-like comparison over a range of ResNet architectures.

Divide and Contrast: Self-supervised Learning from Uncurated Data

no code yet • ICCV 2021

Self-supervised learning holds promise in leveraging large amounts of unlabeled data; however, much of its progress has thus far been limited to highly curated pre-training data such as ImageNet.

Large-Scale Unsupervised Person Re-Identification with Contrastive Learning

no code yet • 17 May 2021

In particular, most existing unsupervised and domain adaptation ReID methods utilize only the public datasets in their experiments, with labels removed.

Self-supervised Pre-training with Hard Examples Improves Visual Representations

no code yet • 25 Dec 2020

Self-supervised pre-training (SSP) employs random image transformations to generate training data for visual representation learning.

A Pseudo-labelling Auto-Encoder for unsupervised image classification

no code yet • 6 Dec 2020

In this paper, we introduce a unique variant of the denoising Auto-Encoder and combine it with the perceptual loss to classify images in an unsupervised manner.