TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image Classification	ImageNet	CAFormer-B36 (384 res)	Top 1 Accuracy	86.4%	# 143
Image Classification	ImageNet	CAFormer-B36 (384 res)	Number of params	99M	# 861
Image Classification	ImageNet	CAFormer-B36 (384 res)	GFLOPs	72.2	# 440
Image Classification	ImageNet	ConvFormer-M36 (224 res, 21K)	Top 1 Accuracy	86.1%	# 170
Image Classification	ImageNet	ConvFormer-M36 (224 res, 21K)	Number of params	57M	# 755
Image Classification	ImageNet	ConvFormer-M36 (224 res, 21K)	GFLOPs	12.8	# 320
Image Classification	ImageNet	CAFormer-M36 (224 res, 21K)	Top 1 Accuracy	86.6%	# 132
Image Classification	ImageNet	CAFormer-M36 (224 res, 21K)	Number of params	56M	# 748
Image Classification	ImageNet	CAFormer-M36 (224 res, 21K)	GFLOPs	13.2	# 324
Image Classification	ImageNet	CAFormer-S36 (384 res, 21K)	Top 1 Accuracy	86.9%	# 115
Image Classification	ImageNet	CAFormer-S36 (384 res, 21K)	Number of params	39M	# 666
Image Classification	ImageNet	CAFormer-S36 (384 res, 21K)	GFLOPs	26.0	# 383
Image Classification	ImageNet	CAFormer-S36 (224 res, 21K)	Top 1 Accuracy	85.8%	# 187
Image Classification	ImageNet	CAFormer-S36 (224 res, 21K)	Number of params	39M	# 666
Image Classification	ImageNet	CAFormer-S36 (224 res, 21K)	GFLOPs	8.0	# 267
Image Classification	ImageNet	ConvFormer-S36 (384 res, 21K)	Top 1 Accuracy	86.4%	# 143
Image Classification	ImageNet	ConvFormer-S36 (384 res, 21K)	Number of params	40M	# 678
Image Classification	ImageNet	ConvFormer-S36 (384 res, 21K)	GFLOPs	22.4	# 371
Image Classification	ImageNet	ConvFormer-S36 (224 res, 21K)	Top 1 Accuracy	85.4%	# 221
Image Classification	ImageNet	ConvFormer-S36 (224 res, 21K)	Number of params	40M	# 678
Image Classification	ImageNet	ConvFormer-S36 (224 res, 21K)	GFLOPs	7.6	# 258
Image Classification	ImageNet	CAFormer-S18 (384 res, 21K)	Top 1 Accuracy	85.4%	# 221
Image Classification	ImageNet	CAFormer-S18 (384 res, 21K)	Number of params	26M	# 607
Image Classification	ImageNet	CAFormer-S18 (384 res, 21K)	GFLOPs	13.4	# 327
Image Classification	ImageNet	CAFormer-S18 (224 res, 21K)	Top 1 Accuracy	84.1%	# 325
Image Classification	ImageNet	CAFormer-S18 (224 res, 21K)	Number of params	26M	# 607
Image Classification	ImageNet	CAFormer-S18 (224 res, 21K)	GFLOPs	4.1	# 196
Image Classification	ImageNet	ConvFormer-S18 (384 res, 21K)	Top 1 Accuracy	85.0%	# 255
Image Classification	ImageNet	ConvFormer-S18 (384 res, 21K)	Number of params	27M	# 615
Image Classification	ImageNet	ConvFormer-S18 (384 res, 21K)	GFLOPs	11.6	# 311
Image Classification	ImageNet	ConvFormer-S18 (224 res, 21K)	Top 1 Accuracy	83.7%	# 365
Image Classification	ImageNet	ConvFormer-S18 (224 res, 21K)	Number of params	27M	# 615
Image Classification	ImageNet	ConvFormer-S18 (224 res, 21K)	GFLOPs	3.9	# 189
Image Classification	ImageNet	CAFormer-S18 (224 res)	Top 1 Accuracy	83.6%	# 378
Image Classification	ImageNet	CAFormer-S18 (224 res)	Number of params	26M	# 607
Image Classification	ImageNet	CAFormer-S18 (224 res)	GFLOPs	4.1	# 196
Image Classification	ImageNet	ConvFormer-B36 (224 res, 21K)	Top 1 Accuracy	87.0%	# 112
Image Classification	ImageNet	ConvFormer-B36 (224 res, 21K)	Number of params	100M	# 868
Image Classification	ImageNet	ConvFormer-B36 (224 res, 21K)	GFLOPs	22.6	# 373
Image Classification	ImageNet	ConvFormer-B36 (384 res, 21K)	Top 1 Accuracy	87.6%	# 84
Image Classification	ImageNet	ConvFormer-B36 (384 res, 21K)	Number of params	100M	# 868
Image Classification	ImageNet	ConvFormer-B36 (384 res, 21K)	GFLOPs	66.5	# 436
Image Classification	ImageNet	CAFormer-M36 (384 res)	Top 1 Accuracy	86.2%	# 164
Image Classification	ImageNet	CAFormer-M36 (384 res)	Number of params	56M	# 748
Image Classification	ImageNet	CAFormer-M36 (384 res)	GFLOPs	42.0	# 413
Image Classification	ImageNet	CAFormer-S36 (384 res)	Top 1 Accuracy	85.7%	# 200
Image Classification	ImageNet	CAFormer-S36 (384 res)	Number of params	39M	# 666
Image Classification	ImageNet	CAFormer-S36 (384 res)	GFLOPs	26.0	# 383
Image Classification	ImageNet	CAFormer-S36 (224 res)	Top 1 Accuracy	84.5%	# 293
Image Classification	ImageNet	CAFormer-S36 (224 res)	Number of params	39M	# 666
Image Classification	ImageNet	CAFormer-S36 (224 res)	GFLOPs	8.0	# 267
Image Classification	ImageNet	ConvFormer-S18 (224 res)	Top 1 Accuracy	83.0%	# 437
Image Classification	ImageNet	ConvFormer-S18 (224 res)	Number of params	27M	# 615
Image Classification	ImageNet	ConvFormer-S18 (224 res)	GFLOPs	3.9	# 189
Image Classification	ImageNet	ConvFormer-S36 (224 res)	Top 1 Accuracy	84.1%	# 325
Image Classification	ImageNet	ConvFormer-S36 (224 res)	Number of params	40M	# 678
Image Classification	ImageNet	ConvFormer-S36 (224 res)	GFLOPs	7.6	# 258
Image Classification	ImageNet	ConvFormer-S18 (384 res)	Top 1 Accuracy	84.4%	# 299
Image Classification	ImageNet	ConvFormer-S18 (384 res)	Number of params	27M	# 615
Image Classification	ImageNet	ConvFormer-S18 (384 res)	GFLOPs	11.6	# 311
Image Classification	ImageNet	ConvFormer-M36 (224 res)	Top 1 Accuracy	84.5%	# 293
Image Classification	ImageNet	ConvFormer-M36 (224 res)	Number of params	57M	# 755
Image Classification	ImageNet	ConvFormer-M36 (224 res)	GFLOPs	12.8	# 320
Image Classification	ImageNet	CAFormer-S18 (384 res)	Top 1 Accuracy	85.0%	# 255
Image Classification	ImageNet	CAFormer-S18 (384 res)	Number of params	26M	# 607
Image Classification	ImageNet	CAFormer-S18 (384 res)	GFLOPs	13.4	# 327
Image Classification	ImageNet	CAFormer-M36 (224 res)	Top 1 Accuracy	85.2%	# 239
Image Classification	ImageNet	CAFormer-M36 (224 res)	Number of params	56M	# 748
Image Classification	ImageNet	CAFormer-M36 (224 res)	GFLOPs	13.2	# 324
Image Classification	ImageNet	ConvFormer-S36 (384 res)	Top 1 Accuracy	85.4%	# 221
Image Classification	ImageNet	ConvFormer-S36 (384 res)	Number of params	40M	# 678
Image Classification	ImageNet	ConvFormer-S36 (384 res)	GFLOPs	22.4	# 371
Image Classification	ImageNet	ConvFormer-M36 (384 res)	Top 1 Accuracy	85.6%	# 209
Image Classification	ImageNet	ConvFormer-M36 (384 res)	Number of params	57M	# 755
Image Classification	ImageNet	ConvFormer-M36 (384 res)	GFLOPs	37.7	# 407
Image Classification	ImageNet	ConvFormer-B36 (224 res)	Top 1 Accuracy	84.8%	# 270
Image Classification	ImageNet	ConvFormer-B36 (224 res)	Number of params	100M	# 868
Image Classification	ImageNet	ConvFormer-B36 (224 res)	GFLOPs	22.6	# 373
Image Classification	ImageNet	CAFormer-B36 (224 res)	Top 1 Accuracy	85.5%	# 212
Image Classification	ImageNet	CAFormer-B36 (224 res)	Number of params	99M	# 861
Image Classification	ImageNet	CAFormer-B36 (224 res)	GFLOPs	23.2	# 375
Image Classification	ImageNet	ConvFormer-B36 (384 res)	Top 1 Accuracy	85.7%	# 200
Image Classification	ImageNet	ConvFormer-B36 (384 res)	Number of params	100M	# 868
Image Classification	ImageNet	ConvFormer-B36 (384 res)	GFLOPs	66.5	# 436
Image Classification	ImageNet	CAFormer-B36 (384 res, 21K)	Top 1 Accuracy	88.1%	# 67
Image Classification	ImageNet	CAFormer-B36 (384 res, 21K)	Number of params	99M	# 861
Image Classification	ImageNet	CAFormer-B36 (384 res, 21K)	GFLOPs	72.2	# 440
Image Classification	ImageNet	CAFormer-B36 (224 res, 21K)	Top 1 Accuracy	87.4%	# 93
Image Classification	ImageNet	CAFormer-B36 (224 res, 21K)	Number of params	99M	# 861
Image Classification	ImageNet	CAFormer-B36 (224 res, 21K)	GFLOPs	23.2	# 375
Image Classification	ImageNet	CAFormer-M36 (384 res, 21K)	Top 1 Accuracy	87.5%	# 86
Image Classification	ImageNet	CAFormer-M36 (384 res, 21K)	Number of params	56M	# 748
Image Classification	ImageNet	CAFormer-M36 (384 res, 21K)	GFLOPs	42.0	# 413
Image Classification	ImageNet	ConvFormer-M36 (384 res, 21K)	Top 1 Accuracy	86.9%	# 115
Image Classification	ImageNet	ConvFormer-M36 (384 res, 21K)	Number of params	57M	# 755
Image Classification	ImageNet	ConvFormer-M36 (384 res, 21K)	GFLOPs	37.7	# 407
Domain Generalization	ImageNet-A	ConvFormer-B36 (384)	Top-1 accuracy %	55.3	# 17
Domain Generalization	ImageNet-A	CAFormer-B36 (IN-21K)	Top-1 accuracy %	69.4	# 9
Domain Generalization	ImageNet-A	CAFormer-B36 (IN-21K, 384)	Top-1 accuracy %	79.5	# 5
Domain Generalization	ImageNet-A	ConvFormer-B36 (IN-21K)	Top-1 accuracy %	63.3	# 12
Domain Generalization	ImageNet-A	ConvFormer-B36 (IN-21K, 384)	Top-1 accuracy %	73.5	# 8
Domain Generalization	ImageNet-A	CAFormer-B36	Top-1 accuracy %	48.5	# 20
Domain Generalization	ImageNet-A	CAFormer-B36 (384)	Top-1 accuracy %	61.9	# 14
Domain Generalization	ImageNet-A	ConvFormer-B36	Top-1 accuracy %	40.1	# 23
Domain Generalization	ImageNet-C	CAFormer-B36 (IN21K, 384)	mean Corruption Error (mCE)	30.8	# 2
Domain Generalization	ImageNet-C	CAFormer-B36	mean Corruption Error (mCE)	42.6	# 18
Domain Generalization	ImageNet-C	ConvFormer-B36 (IN21K)	mean Corruption Error (mCE)	35.0	# 7
Domain Generalization	ImageNet-C	CAFormer-B36 (IN21K)	mean Corruption Error (mCE)	31.8	# 5
Domain Generalization	ImageNet-C	ConvFormer-B36	mean Corruption Error (mCE)	46.3	# 23
Domain Generalization	ImageNet-R	CAFormer-B36 (IN21K, 384)	Top-1 Error Rate	29.6	# 5
Domain Generalization	ImageNet-R	CAFormer-B36 (IN21K)	Top-1 Error Rate	31.7	# 7
Domain Generalization	ImageNet-R	ConvFormer-B36	Top-1 Error Rate	48.9	# 25
Domain Generalization	ImageNet-R	ConvFormer-B36 (384)	Top-1 Error Rate	47.8	# 24
Domain Generalization	ImageNet-R	CAFormer-B36 (384)	Top-1 Error Rate	45	# 21
Domain Generalization	ImageNet-R	CAFormer-B36	Top-1 Error Rate	46.1	# 23
Domain Generalization	ImageNet-R	ConvFormer-B36 (IN21K, 384)	Top-1 Error Rate	33.5	# 10
Domain Generalization	ImageNet-R	ConvFormer-B36 (IN21K)	Top-1 Error Rate	34.7	# 13
Domain Generalization	ImageNet-Sketch	ConvFormer-B36 (IN21K, 384)	Top-1 accuracy	52.9	# 7
Domain Generalization	ImageNet-Sketch	CAFormer-B36	Top-1 accuracy	42.5	# 17
Domain Generalization	ImageNet-Sketch	ConvFormer-B36	Top-1 accuracy	39.5	# 19
Domain Generalization	ImageNet-Sketch	CAFormer-B36 (IN21K, 384)	Top-1 accuracy	54.5	# 5
Domain Generalization	ImageNet-Sketch	ConvFormer-B36 (IN21K)	Top-1 accuracy	52.7	# 9
Domain Generalization	ImageNet-Sketch	CAFormer-B36 (IN21K)	Top-1 accuracy	52.8	# 8
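
The table above stores one metric per row, so each model takes up three rows. As a rough sketch (not part of the original page), the rows can be pivoted into one record per model, which makes comparisons such as the gain from ImageNet-21K pre-training easy to compute. The helper names `pivot`, `to_number` and `accuracy_gain` are hypothetical, and only six rows copied from the table are used.

```python
from collections import defaultdict

# Six rows copied from the table above, in the same tab-separated column order.
LINES = [
    "Image Classification\tImageNet\tCAFormer-B36 (384 res)\tTop 1 Accuracy\t86.4%\t# 143",
    "Image Classification\tImageNet\tCAFormer-B36 (384 res)\tNumber of params\t99M\t# 861",
    "Image Classification\tImageNet\tCAFormer-B36 (384 res)\tGFLOPs\t72.2\t# 440",
    "Image Classification\tImageNet\tCAFormer-B36 (384 res, 21K)\tTop 1 Accuracy\t88.1%\t# 67",
    "Image Classification\tImageNet\tCAFormer-B36 (384 res, 21K)\tNumber of params\t99M\t# 861",
    "Image Classification\tImageNet\tCAFormer-B36 (384 res, 21K)\tGFLOPs\t72.2\t# 440",
]


def pivot(lines):
    """Group metric rows by (task, dataset, model) into {metric name: value string}."""
    records = defaultdict(dict)
    for line in lines:
        task, dataset, model, metric, value, _rank = line.split("\t")
        records[(task, dataset, model)][metric] = value
    return dict(records)


def to_number(value):
    """Drop the '%' or 'M' suffix used in the table and return a float."""
    return float(value.rstrip("%M"))


def accuracy_gain(records, baseline_model, other_model,
                  task="Image Classification", dataset="ImageNet"):
    """Top-1 accuracy difference (other minus baseline), rounded to one decimal."""
    base = to_number(records[(task, dataset, baseline_model)]["Top 1 Accuracy"])
    other = to_number(records[(task, dataset, other_model)]["Top 1 Accuracy"])
    return round(other - base, 1)


records = pivot(LINES)
print(accuracy_gain(records, "CAFormer-B36 (384 res)", "CAFormer-B36 (384 res, 21K)"))  # 1.7
```

The same pivot works for the Domain Generalization rows, but note that ImageNet-R and ImageNet-C report error rates (lower is better) while ImageNet-A and ImageNet-Sketch report accuracy.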

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/metaformer-baselines-for-vision/domain-generalization-on-imagenet-c)](https://paperswithcode.com/sota/domain-generalization-on-imagenet-c?p=metaformer-baselines-for-vision)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/metaformer-baselines-for-vision/domain-generalization-on-imagenet-a)](https://paperswithcode.com/sota/domain-generalization-on-imagenet-a?p=metaformer-baselines-for-vision)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/metaformer-baselines-for-vision/domain-generalization-on-imagenet-r)](https://paperswithcode.com/sota/domain-generalization-on-imagenet-r?p=metaformer-baselines-for-vision)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/metaformer-baselines-for-vision/domain-generalization-on-imagenet-sketch)](https://paperswithcode.com/sota/domain-generalization-on-imagenet-sketch?p=metaformer-baselines-for-vision)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/metaformer-baselines-for-vision/image-classification-on-imagenet)](https://paperswithcode.com/sota/image-classification-on-imagenet?p=metaformer-baselines-for-vision)`
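
All five badges above follow the same URL pattern, differing only in the benchmark slug. A minimal sketch of that pattern, inferred purely from the badge strings listed here (the function name `badge_markdown` is hypothetical, and this is not an official Papers with Code API):

```python
PWC = "https://paperswithcode.com"
SHIELD = "https://img.shields.io/endpoint.svg"


def badge_markdown(paper_slug, benchmark_slug):
    """Build the badge markdown for one benchmark, following the pattern seen above."""
    badge_url = f"{SHIELD}?url={PWC}/badge/{paper_slug}/{benchmark_slug}"
    link_url = f"{PWC}/sota/{benchmark_slug}?p={paper_slug}"
    return f"[![PWC]({badge_url})]({link_url})"


# Reproduces the ImageNet-C badge from the list above.
print(badge_markdown("metaformer-baselines-for-vision",
                     "domain-generalization-on-imagenet-c"))
```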

MetaFormer Baselines for Vision

24 Oct 2022 · Weihao Yu, Chenyang Si, Pan Zhou, Mi Luo, Yichen Zhou, Jiashi Feng, Shuicheng Yan, Xinchao Wang

MetaFormer, the abstracted architecture of Transformer, has been found to play a significant role in achieving competitive performance. In this paper, we further explore the capacity of MetaFormer, again without focusing on token mixer design: we introduce several baseline models under MetaFormer using the most basic or common mixers, and summarize our observations as follows.

(1) MetaFormer ensures a solid lower bound of performance. By merely adopting identity mapping as the token mixer, the MetaFormer model, termed IdentityFormer, achieves >80% accuracy on ImageNet-1K.

(2) MetaFormer works well with arbitrary token mixers. Even when the token mixer is specified as a random matrix, the resulting model, RandFormer, yields an accuracy of >81%, outperforming IdentityFormer. MetaFormer's results can therefore be relied on when new token mixers are adopted.

(3) MetaFormer effortlessly offers state-of-the-art results. With just conventional token mixers dating back five years, the models instantiated from MetaFormer already beat the state of the art. (a) ConvFormer outperforms ConvNeXt. Taking the common depthwise separable convolutions as the token mixer, the model termed ConvFormer, which can be regarded as a pure CNN, outperforms the strong CNN model ConvNeXt. (b) CAFormer sets a new record on ImageNet-1K. By simply applying depthwise separable convolutions as the token mixer in the bottom stages and vanilla self-attention in the top stages, the resulting model CAFormer sets a new record on ImageNet-1K: it achieves an accuracy of 85.5% at 224x224 resolution, under normal supervised training without external data or distillation.

In our expedition to probe MetaFormer, we also find that a new activation, StarReLU, reduces the FLOPs of activation by 71% compared with GELU yet achieves better performance. We expect StarReLU to find great potential in MetaFormer-like models alongside other neural networks.
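
To make the idea of a swappable token mixer concrete, here is a minimal pure-Python sketch of the two simplest mixers the abstract names. It is an illustration under stated assumptions, not the authors' implementation: tokens are plain lists of floats, normalisation and the channel MLP are omitted, and the softmax row-normalisation of the random matrix is an assumption. All function names are hypothetical.

```python
import math
import random


def identity_mixer(tokens):
    """IdentityFormer-style mixer: every token passes through unchanged."""
    return [row[:] for row in tokens]


def make_random_mixer(num_tokens, seed=0):
    """RandFormer-style mixer: a fixed, never-trained token-by-token matrix.

    Assumption: each row is softmax-normalised, so every output token is a
    weighted average of the input tokens.
    """
    rng = random.Random(seed)
    weights = []
    for _ in range(num_tokens):
        exps = [math.exp(rng.random()) for _ in range(num_tokens)]
        total = sum(exps)
        weights.append([e / total for e in exps])

    def mixer(tokens):
        channels = len(tokens[0])
        return [
            [sum(weights[i][j] * tokens[j][c] for j in range(num_tokens))
             for c in range(channels)]
            for i in range(num_tokens)
        ]

    return mixer


def mixing_sublayer(tokens, mixer):
    """Residual token-mixing step, x + mixer(x); norm and MLP are left out."""
    mixed = mixer(tokens)
    return [[x + m for x, m in zip(row, mixed_row)]
            for row, mixed_row in zip(tokens, mixed)]


tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 tokens, 2 channels
same_shape = mixing_sublayer(tokens, make_random_mixer(num_tokens=3))
doubled = mixing_sublayer(tokens, identity_mixer)  # x + x
```

Only the `mixer` argument changes between the two calls, which mirrors the abstract's point that the surrounding architecture stays fixed while the token mixer is swapped.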

Code

Repository	GitHub stars
rwightman/pytorch-image-models (official)	29,828
sail-sg/metaformer (official)	360
facebookresearch/xformers	7,624
sail-sg/poolformer	1,241
Westlake-AI/openmixup	574

5 of 7 implementations listed.

Tasks

Domain Generalization

Image Classification

Datasets

ImageNet

MS COCO

ADE20K

ImageNet-1K

ImageNet-C

ImageNet-R

ImageNet-A

ImageNet-Sketch

Results from the Paper

Ranked #2 on Domain Generalization on ImageNet-C (using extra training data)

Methods

Adam • ConvNeXt • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • MetaFormer • Multi-Head Attention • PoolFormer • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • StarReLU • Transformer
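
StarReLU, listed among the methods above, is described in the abstract as cutting activation FLOPs by 71% relative to GELU. A minimal scalar sketch follows, assuming the form `scale * relu(x)**2 + bias` with `scale` and `bias` being learnable scalars in a real model; the default values below are neutral placeholders. The per-unit costs of 4 FLOPs for StarReLU and 14 FLOPs for GELU are also assumptions, chosen because they reproduce the 71% figure.

```python
def star_relu(x, scale=1.0, bias=0.0):
    """Assumed StarReLU form: scale * relu(x)**2 + bias.

    scale and bias would be learnable scalars in a real model; the
    defaults here are placeholders, not trained or recommended values.
    """
    relu = x if x > 0.0 else 0.0
    return scale * relu * relu + bias


# Assumed per-unit costs (not stated on this page): 4 FLOPs vs 14 FLOPs.
STAR_RELU_FLOPS = 4
GELU_FLOPS = 14
flops_saving = 1 - STAR_RELU_FLOPS / GELU_FLOPS  # about 0.714

print(star_relu(-1.0), star_relu(2.0))
print(f"{flops_saving:.0%}")
```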
