TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semantic Segmentation	ADE20K	TEC (ViT-B, UperNet)	Validation mIoU	51.0	# 96
Object Detection	COCO minival	TEC (ViT-B, Mask R-CNN)	box AP	54.6	# 52
Self-Supervised Image Classification	ImageNet (finetuned)	TEC_MAE (ViT-L/16, 224)	Top 1 Accuracy	86.5%	# 13
Semantic Segmentation	ImageNet-S	TEC (ViT-B/16, 224x224, SSL+FT, mmseg)	mIoU (val)	63.2	# 1
Semantic Segmentation	ImageNet-S	TEC (ViT-B/16, 224x224, SSL+FT, mmseg)	mIoU (test)	62.5	# 2
Semantic Segmentation	ImageNet-S	TEC (ViT-B/16, 224x224, SSL+FT)	mIoU (val)	62.0	# 3
Semantic Segmentation	ImageNet-S	TEC (ViT-B/16, 224x224, SSL)	mIoU (val)	42.9	# 14
Semantic Segmentation	ImageNet-S	TEC (ViT-B/16, 224x224, SSL, mmseg)	mIoU (val)	46.1	# 13
Semantic Segmentation	ImageNet-S	TEC (ViT-B/16, 224x224, SSL, mmseg)	mIoU (test)	46.0	# 12

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/towards-sustainable-self-supervised-learning/semantic-segmentation-on-imagenet-s)](https://paperswithcode.com/sota/semantic-segmentation-on-imagenet-s?p=towards-sustainable-self-supervised-learning)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/towards-sustainable-self-supervised-learning/self-supervised-image-classification-on-1)](https://paperswithcode.com/sota/self-supervised-image-classification-on-1?p=towards-sustainable-self-supervised-learning)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/towards-sustainable-self-supervised-learning/object-detection-on-coco-minival)](https://paperswithcode.com/sota/object-detection-on-coco-minival?p=towards-sustainable-self-supervised-learning)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/towards-sustainable-self-supervised-learning/semantic-segmentation-on-ade20k)](https://paperswithcode.com/sota/semantic-segmentation-on-ade20k?p=towards-sustainable-self-supervised-learning)`

Towards Sustainable Self-supervised Learning

20 Oct 2022 · ShangHua Gao, Pan Zhou, Ming-Ming Cheng, Shuicheng Yan

Although increasingly expensive to train, most self-supervised learning (SSL) models are repeatedly trained from scratch yet never fully utilized, since only a few SOTA models are employed for downstream tasks. In this work, we explore a sustainable SSL framework that addresses two major challenges: i) learning a stronger new SSL model from an existing pretrained SSL model, called the "base" model, in a cost-friendly manner, and ii) making the training of the new model compatible with various base models. We propose a Target-Enhanced Conditional (TEC) scheme, which introduces two components into existing mask-reconstruction based SSL. First, we propose patch-relation enhanced targets, which enhance the target given by the base model and encourage the new model to learn semantic-relation knowledge from the base model using incomplete inputs. This hardening and target enhancement help the new model surpass the base model, since they enforce additional patch-relation modeling to handle incomplete input. Second, we introduce a conditional adapter that adaptively adjusts the new model's prediction to align with the targets of different base models. Extensive experimental results show that our TEC scheme can accelerate learning and also improve SOTA SSL base models, e.g., MAE and iBOT, taking an explorative step towards sustainable SSL.
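The two components named in the abstract can be illustrated with a toy, dependency-free sketch. This is not the authors' implementation (see sail-sg/tec for that). It assumes, purely for illustration, that a relation-enhanced target is an attention-style mix of the base model's patch features, `softmax(F F^T / sqrt(D)) F`, and that the conditional adapter is a per-base-model affine map. The function names `relation_enhanced_targets`, `conditional_adapter`, and `masked_loss` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def relation_enhanced_targets(feats):
    """Assumed target form: softmax(F F^T / sqrt(D)) F.

    feats is a list of N patch feature vectors (length D each) taken from a
    frozen base model. Each target is a similarity-weighted mix of all patch
    features, so it carries patch-relation information.
    """
    d = len(feats[0])
    scale = 1.0 / math.sqrt(d)
    targets = []
    for q in feats:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in feats]
        weights = softmax(scores)
        targets.append(
            [sum(w * k[j] for w, k in zip(weights, feats)) for j in range(d)]
        )
    return targets


def conditional_adapter(pred, scale, shift):
    """Hypothetical adapter: one affine map (scale, shift) per base model."""
    return [[s * p + b for p, s, b in zip(row, scale, shift)] for row in pred]


def masked_loss(pred, target, masked_idx):
    """Mean squared error computed only on the masked patch positions."""
    total = 0.0
    for i in masked_idx:
        sq = [(p - t) ** 2 for p, t in zip(pred[i], target[i])]
        total += sum(sq) / len(sq)
    return total / len(masked_idx)


# Toy usage: two patches with 2-D base-model features; patch 1 is masked.
base_feats = [[1.0, 0.0], [0.0, 1.0]]
targets = relation_enhanced_targets(base_feats)
new_model_pred = [[0.5, 0.5], [0.5, 0.5]]  # stand-in for the new model's output
adapted = conditional_adapter(new_model_pred, scale=[1.0, 1.0], shift=[0.0, 0.0])
loss = masked_loss(adapted, targets, masked_idx=[1])
```

In this sketch, switching base models would mean swapping both `base_feats` and the `(scale, shift)` pair, while the new model itself stays unchanged. That mirrors the compatibility goal stated in the abstract, under the assumptions above.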

Code

sail-sg/tec official

Tasks

Object Detection

Relation

Self-Supervised Image Classification

Self-Supervised Learning

Semantic Segmentation

Datasets

ImageNet

MS COCO

ADE20K

ImageNet-S

Results from the Paper

Ranked #1 on Semantic Segmentation on ImageNet-S

Methods

Adapter • ALIGN • BASE • MAE
