SparseSwin: Swin Transformer with Sparse Transformer Block

Advancements in computer vision research have established the transformer architecture as the state of the art in computer vision tasks. One known drawback of the transformer architecture is its high parameter count, which can lead to a more complex and inefficient algorithm. This paper aims to reduce the number of parameters and, in turn, make the transformer more efficient. We present the Sparse Transformer (SparTa) block, a modified transformer block that adds a sparse token converter to reduce the number of tokens used. We use the SparTa block inside the Swin-T architecture (SparseSwin) to leverage Swin's ability to downsample its input and reduce the number of initial tokens to be computed. The proposed SparseSwin model outperforms other state-of-the-art models in image classification, with accuracies of 86.96%, 97.43%, and 85.35% on the ImageNet-100, CIFAR-10, and CIFAR-100 datasets, respectively. Despite having fewer parameters, these results highlight the potential of a transformer architecture that uses a sparse token converter with a limited number of tokens to optimize the use of the transformer and improve its performance.
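The sketch below illustrates the core idea described in the abstract: a token converter that maps the many tokens produced by a Swin stage down to a small, fixed token count before attention is applied. The class names `SparseTokenConverter` and `SparTaBlock` follow the paper's terminology, but the linear token-mixing reduction, block internals, and all dimensions here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SparseTokenConverter(nn.Module):
    """Maps N input tokens down to a fixed, smaller count M.
    A sketch: the paper's exact reduction mechanism may differ."""
    def __init__(self, in_tokens, out_tokens):
        super().__init__()
        # Learnable linear mixing across the token axis: (B, N, C) -> (B, M, C)
        self.reduce = nn.Linear(in_tokens, out_tokens)

    def forward(self, x):            # x: (B, N, C)
        x = x.transpose(1, 2)        # (B, C, N)
        x = self.reduce(x)           # (B, C, M)
        return x.transpose(1, 2)     # (B, M, C)

class SparTaBlock(nn.Module):
    """Token reduction followed by a standard pre-norm transformer
    block, so attention runs over M << N tokens."""
    def __init__(self, in_tokens, out_tokens, dim, num_heads=8, mlp_ratio=4.0):
        super().__init__()
        self.converter = SparseTokenConverter(in_tokens, out_tokens)
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        hidden = int(dim * mlp_ratio)
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
        )

    def forward(self, x):                      # x: (B, N, C)
        x = self.converter(x)                  # (B, M, C), M < N
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        x = x + self.mlp(self.norm2(x))
        return x

# Example: 49 tokens from a downsampled Swin stage reduced to 16
block = SparTaBlock(in_tokens=49, out_tokens=16, dim=96)
out = block(torch.randn(2, 49, 96))
print(out.shape)  # torch.Size([2, 16, 96])
```

Because self-attention cost grows quadratically with token count, shrinking N to M before attention is where the parameter and compute savings come from in this design.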

Results from the Paper


| Task                 | Dataset      | Model              | Metric             | Value  | Global Rank |
|----------------------|--------------|--------------------|--------------------|--------|-------------|
| Image Classification | CIFAR-10     | SparseSwin         | Percentage correct | 97.43  | #77         |
| Image Classification | CIFAR-10     | SparseSwin         | Params             | 17.58M | #203        |
| Image Classification | CIFAR-100    | SparseSwin         | Percentage correct | 85.35  | #65         |
| Image Classification | CIFAR-100    | SparseSwin         | Params             | 17.58M | #190        |
| Image Classification | ImageNet-100 | SparseSwin with L2 | Percentage correct | 86.96  | #1          |
| Image Classification | ImageNet-100 | SparseSwin with L2 | Params             | 17.58M | #2          |