Image Classification

3784 papers with code • 142 benchmarks • 240 datasets

Image Classification is a fundamental task in visual recognition that aims to assign a single label to an image as a whole. Unlike object detection, which classifies and localizes multiple objects within an image, image classification typically deals with single-object images. When the classification becomes very fine-grained or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems
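
As a rough illustration of "one label for the whole image", the sketch below turns a vector of per-class scores into a single predicted label. The label names and scores are hypothetical, and the softmax-plus-argmax step is one common convention rather than something prescribed by the source; a real model would produce the scores from the image pixels.

```python
import math

def softmax(scores):
    """Convert raw class scores (logits) into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores, labels):
    """Return the single most probable label and its probability for one image."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]

# Hypothetical labels and scores for one image (assumption, not from the source).
LABELS = ["cat", "dog", "car"]
label, confidence = classify([2.0, 0.5, -1.0], LABELS)
print(label, round(confidence, 3))
```

An object detector would instead return several (label, box) pairs per image; here the output is exactly one label.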

Benchmarks

These leaderboards are used to track progress in Image Classification.

Dataset	Best Model	Compare
ImageNet	OmniVec(ViT)	See all
CIFAR-10	ViT-H/14	See all
CIFAR-100	EffNet-L2 (SAM)	See all
STL-10	VIT-L/16 (Spinal FC)	See all
ObjectNet	CoCa	See all
MNIST	Branching/Merging CNN + Homogeneous Vector Capsules	See all
SVHN	WRN28-10 (SAM)	See all
ImageNet ReaL	Baseline (ViT-G/14)	See all
iNaturalist 2018	OmniVec	See all
Flowers-102	CCT-14/7x2	See all
Clothing1M	LRA-diffusion (CC)	See all
mini WebVision 1.0	LRA-diffusion (CLIP ViT)	See all
VTAB-1k	ALIGN (50 hypers/task)	See all
ImageNet V2	Model soups (BASIC-L)	See all
Kuzushiji-MNIST	VGG-5 (Spinal FC)	See all
OmniBenchmark	NOAH-ViTB/16	See all
Fashion-MNIST	Fine-Tuning DARTS	See all
Stanford Cars	efficient adaptive ensembling	See all
Tiny ImageNet Classification	Astroformer	See all
DF20 - Mini	ViT-Large/16 (384)	See all
DF20	ViT-Large/16 (384)	See all
iNaturalist 2019	Hiera-H (448px)	See all
WebVision-1000	MAM (ViT-B/16)	See all
Places205	InternImage-H	See all
iNaturalist	Hiera-H (448px)	See all
EuroSAT	IMP+MTP(InternImage-XL)	See all
RESISC45	ResNet50	See all
DTD	RADAM (ConvNeXt-L)	See all
CINIC-10	VIT-L/16 (Spinal FC, Background)	See all
EMNIST-Letters	WaveMixLite-112/16	See all
Clothing1M (using clean data)	CurriculumNet	See all
GasHisSDB	CoAtNet-1	See all
Tiered ImageNet 5-way (5-shot)	EGNN+Transduction	See all
smallNORB	Heinsen Routing	See all
EMNIST-Digits	WaveMixLite-112/16	See all
EMNIST-Balanced	WaveMixLite-128/7	See all
Food-101	Bamboo (ViTB/16)	See all
Oxford-IIIT Pets	efficient adaptive ensembling	See all
Colored-MNIST(with spurious correlation)	MLP-DecAug	See all
Places365	OmniVec(ViT)	See all
Oxford-IIIT Pet Dataset	TWIST (ResNet-50)	See all
Red MiniImageNet 20% label noise	NCR (ResNet-18)	See all
Red MiniImageNet 40% label noise	NCR (ResNet-18)	See all
Red MiniImageNet 80% label noise	NCR (ResNet-18)	See all
Food-101N	LRA-diffusion (CLIP ViT)	See all
ObjectNet (Bounding Box)	BiT-L (ResNet)	See all
Tiny-ImageNet	UPANets	See all
MAMe	EfficientNet-B3	See all
JFT-300M	V-MoE-H/14 (Every-2)	See all
N-MNIST	STS-ResNet	See all
Red MiniImageNet 60% label noise	InstanceGM-SS	See all
CUB	Entropy-based Logic Explained Network	See all
PlantVillage	adaptive minimal ensembling	See all
CIFAR-10 (with noisy labels)	PGDF (ResNet-18)	See all
Places365-Standard	SWAG (ViT H/14)	See all
Caltech-256	AG-Net	See all
Visual Wake Words	HyT-NAS-BA	See all
CelebA 64x64	cFlow	See all
SIPaKMeD	DL+PCA+GWO	See all
FlickrLogos-32	TC-VII (with outside data)	See all
Malaria Dataset	kEffNet-B0 V2 16ch	See all
Certificate Verification	ResMLP-24	See all
EuroSAT-SAR	FG-MAE (ViT-S/16)	See all
Imbalanced CUB-200-2011	Multi-task	See all
Noisy MNIST (AWGN)	PCGAN-CHAR	See all
Noisy MNIST (Motion)	PCGAN-CHAR	See all
Noisy MNIST (Contrast)	PCGAN-CHAR	See all
WebVision	LRA-diffusion	See all
Fracture/Normal Shoulder Bone X-ray Images on MURA	Our Ensemble Learning-2	See all
HErlev	Fuzzy Distance Ensemble	See all
CIFAR-10, 40% Symmetric Noise	FaMUS	See all
CIFAR-10, 60% Symmetric Noise	MentorMix	See all
CIFAR-100, 40% Symmetric Noise	FaMUS	See all
Large Labelled Logo Dataset (L3D)	L3D_original_2level	See all
Causal3DIdent	SimCLR	See all
CUB-200-2011	Sparse-CBM	See all
CLEVR/Count	SEER (RegNet10B)	See all
CLEVR/Dist	SEER (RegNet10B)	See all
ObjectNet (ImageNet classes)	Diffusion Classifier (zero-shot)	See all
Imagenette	µ2Net+ (ViT-L/16)	See all
ImageNet-100	SparseSwin with L2	See all
Galaxy10 DECals	WaveMix	See all
LIMUC	Inception-v3	See all
CIFAR-10 Image Classification	ASF-former-S	See all
N-Caltech 101	mMND (STDP)	See all
CIFAR-10 (40 Labels, ImageNet-100 Unlabeled)	UnMixMatch	See all
MultiMNIST	CapsNet	See all
QMNIST	VGG-5(Spinal FC)	See all
iCassava'19	E2E-3M	See all
cifar-10,4000	WRN-28-2 + UDA+AutoDropout	See all
ImageNet-10	ResNet-50 + UDA+AutoDropout	See all
LabelMe	CoNAL	See all
Surrey ASL	E2E-3M	See all
ArtDL	ResNet-50	See all
PASCAL VOC 2007	NNCLR	See all
MNIST-rot-12	PDO-eConv (ours)	See all
MNIST-rot-12k (DA)	PDO-eConv (ours)	See all
Sports10	Max Margin Contrastive	See all
CIFAR-100, 60% Symmetric Noise	MentorMix	See all
SARS-COV-2	Fuzzy rank-based fusion of CNN models using Gompertz function	See all
FGVC Aircraft	TransBoost-ResNet50	See all
ImageNet-Sketch	µ2Net+ (ViT-L/16)	See all
Chaoyang	HSANR	See all
PRImA	ResNet-152 2x (RS training)	See all
ISBNet	ThanosNet	See all
FGVC-Aircraft	EnGraf-Net101 (G=4, H=1)	See all
KITTI-Dist	SEER (RegNet10B)	See all
EMNIST-Byclass	ResNet-18	See all
EMNIST-Bymerge	WaveMixLite-128/16	See all
iNat2021-mini	WaveMix-256/16 (level 2)	See all
AmsterTime	AP-GeM (ResNet-101)	See all
PlantDoc	kMobileNet V3 Large 16ch	See all
cifar100	shreynet	See all
CIFAR-100 (alpha=0, 20 clients per round)	FedAvgM + ASAM + SWA	See all
KMNIST	µ2Net	See all
Stanford Online Products	µ2Net+ (ViT-L/16)	See all
CARS196	µ2Net+ (ViT-L/16)	See all
So2Sat LCZ42	ResNet50	See all
SUN397	TransBoost-ResNet50	See all
FEMNIST	pFedBreD_ns_mg	See all
ImageNet-9	SqueezeNet + Simple Bypass	See all
ImageNet-P	SqueezeNet + Simple Bypass	See all
cats_vs_dogs	µ2Net+ (ViT-L/16)	See all
BreakHis	WaveMix-224/10	See all
DVS128 Gesture		See all
KTH-TIPS2	RADAM (ConvNeXt-XL)	See all
FMD (materials)	RADAM (ConvNeXt-L)	See all
Deep PCB	ResNet	See all
mnist	WaveMixLite	See all
Intel Image Classification	ResNet	See all
ImageNet-32	WRN (N=28, k=10)	See all
ImageNet-64	WRN (N=36, k=5)	See all
VizWiz-Classification	VOLO-D5	See all
Split M-NIST	Model with negotiation paradigm	See all
Split Fashion M-NIST	Model with negotiation paradigm	See all
Split CIFAR-10	Model with negotiation paradigm	See all
split CIFAR-100	Model with negotiation paradigm	See all
RGB Arabic Alphabet Sign Language (AASL) dataset	ArabSignNet	See all
No Background RGB Arabic Alphabets Sign Language Dataset	ArabSignNet	See all
cifar10	SAM	See all
ImageNet-Hard	EfficientNet-L2-Ns	See all
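
Leaderboards like these need a metric to rank models by. The page does not say which metric each benchmark uses, so the sketch below assumes top-k accuracy, a common choice for image classification. The function name and the example scores are hypothetical.

```python
def top_k_accuracy(score_rows, true_labels, k=1):
    """Fraction of samples whose true class index is among the k highest scores."""
    if not true_labels or len(score_rows) != len(true_labels):
        raise ValueError("need one true label per non-empty row of scores")
    hits = 0
    for scores, truth in zip(score_rows, true_labels):
        # Class indices sorted from highest score to lowest.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)

# Hypothetical scores for three images over three classes.
scores = [
    [0.70, 0.20, 0.10],  # true class 0: top-1 hit
    [0.35, 0.45, 0.20],  # true class 0: top-1 miss, top-2 hit
    [0.10, 0.30, 0.60],  # true class 1: top-1 miss, top-2 hit
]
truth = [0, 0, 1]

top1 = top_k_accuracy(scores, truth, k=1)
top2 = top_k_accuracy(scores, truth, k=2)
print(top1, top2)
```

Individual benchmarks may report other metrics (for example error rate or mean per-class accuracy), so check each leaderboard before comparing numbers.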

Libraries

Use these libraries to find Image Classification models and implementations.

Library	Papers	Stars
rwightman/pytorch-image-models	80	29,774
osmr/imgclsmob	51	2,917
Westlake-AI/openmixup	36	570
open-mmlab/mmclassification	30	3,157

See all 45 libraries.

Subtasks

Semi-Supervised Image Classification

Learning with noisy labels

Hyperspectral Image Classification

Self-Supervised Image Classification

Small Data Image Classification

Multi-Label Image Classification

Genre classification

Sequential Image Classification

Unsupervised Image Classification

Efficient ViTs

Document Image Classification

Satellite Image Classification

Sparse Representation-based Classification

Photo geolocation estimation

Image Classification with Differential Privacy

Token Reduction

Superpixel Image Classification

Classification Consistency

Gallbladder Cancer Detection

Artistic style classification

Artist classification

Temporal Metadata Manipulation Detection

Misclassification Rate - Natural Adversarial Samples

Scale Generalisation

Most implemented papers

Deep Residual Learning for Image Recognition

tensorflow/models • CVPR 2016

Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.

469

Very Deep Convolutional Networks for Large-Scale Image Recognition

tensorflow/models • 4 Sep 2014

In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting.

301

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

tensorflow/tensorflow • 17 Apr 2017

We present a class of efficient models called MobileNets for mobile and embedded vision applications.

153

MobileNetV2: Inverted Residuals and Linear Bottlenecks

tensorflow/models • CVPR 2018

In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes.

148

Densely Connected Convolutional Networks

liuzhuang13/DenseNet • CVPR 2017

Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output.

143

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

google-research/vision_transformer • ICLR 2021

While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited.

143

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

tensorflow/tpu • ICML 2019

Convolutional Neural Networks (ConvNets) are commonly developed at a fixed resource budget, and then scaled up for better accuracy if more resources are available.

133