Image Classification

3802 papers with code • 142 benchmarks • 238 datasets

Image Classification is a fundamental task in visual recognition that aims to understand and categorize an image as a whole under a specific label. Unlike object detection, which involves both classifying and localizing multiple objects within an image, image classification typically deals with single-object images. When the classification becomes highly fine-grained or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.

Source: Metamorphic Testing for Object Detection Systems
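
The "one label for the whole image" idea described above can be sketched as follows. This is a minimal illustration, assuming a model has already produced one raw score (logit) per class; the class names and scores are hypothetical and do not come from any benchmark listed here.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, class_names):
    """Return the single (label, probability) pair with the highest score."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Hypothetical scores a model might output for one image.
class_names = ["cat", "dog", "car"]
label, prob = classify([2.0, 0.5, -1.0], class_names)
print(label, round(prob, 3))
```

An object detector, by contrast, would return a list of (box, label, score) entries per image rather than a single label.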

Benchmarks

These leaderboards are used to track progress in Image Classification.

Dataset	Best Model	Compare
ImageNet	OmniVec(ViT)	See all
CIFAR-10	ViT-H/14	See all
CIFAR-100	EffNet-L2 (SAM)	See all
STL-10	VIT-L/16 (Spinal FC)	See all
ObjectNet	CoCa	See all
MNIST	Branching/Merging CNN + Homogeneous Vector Capsules	See all
SVHN	WRN28-10 (SAM)	See all
ImageNet ReaL	Baseline (ViT-G/14)	See all
iNaturalist 2018	OmniVec	See all
Flowers-102	CCT-14/7x2	See all
Clothing1M	LRA-diffusion (CC)	See all
mini WebVision 1.0	LRA-diffusion (CLIP ViT)	See all
VTAB-1k	ALIGN (50 hypers/task)	See all
ImageNet V2	Model soups (BASIC-L)	See all
Kuzushiji-MNIST	VGG-5 (Spinal FC)	See all
OmniBenchmark	NOAH-ViTB/16	See all
Fashion-MNIST	Fine-Tuning DARTS	See all
Stanford Cars	efficient adaptive ensembling	See all
Tiny ImageNet Classification	Astroformer	See all
DF20 - Mini	ViT-Large/16 (384)	See all
DF20	ViT-Large/16 (384)	See all
iNaturalist 2019	Hiera-H (448px)	See all
WebVision-1000	MAM (ViT-B/16)	See all
Places205	InternImage-H	See all
iNaturalist	Hiera-H (448px)	See all
EuroSAT	IMP+MTP(InternImage-XL)	See all
RESISC45	ResNet50	See all
DTD	Linear FT(ViT-L/14)	See all
CINIC-10	VIT-L/16 (Spinal FC, Background)	See all
EMNIST-Letters	WaveMixLite-112/16	See all
Clothing1M (using clean data)	CurriculumNet	See all
GasHisSDB	CoAtNet-1	See all
Tiered ImageNet 5-way (5-shot)	EGNN+Transduction	See all
smallNORB	Heinsen Routing	See all
EMNIST-Digits	WaveMixLite-112/16	See all
EMNIST-Balanced	WaveMixLite-128/7	See all
Food-101	Bamboo (ViTB/16)	See all
Oxford-IIIT Pets	efficient adaptive ensembling	See all
Colored-MNIST(with spurious correlation)	MLP-DecAug	See all
Places365	OmniVec(ViT)	See all
Oxford-IIIT Pet Dataset	TWIST (ResNet-50)	See all
Red MiniImageNet 20% label noise	NCR (ResNet-18)	See all
Red MiniImageNet 40% label noise	NCR (ResNet-18)	See all
Red MiniImageNet 80% label noise	NCR (ResNet-18)	See all
Food-101N	LRA-diffusion (CLIP ViT)	See all
ObjectNet (Bounding Box)	BiT-L (ResNet)	See all
Tiny-ImageNet	UPANets	See all
MAMe	EfficientNet-B3	See all
JFT-300M	V-MoE-H/14 (Every-2)	See all
N-MNIST	STS-ResNet	See all
Red MiniImageNet 60% label noise	InstanceGM-SS	See all
CUB	Entropy-based Logic Explained Network	See all
PlantVillage	adaptive minimal ensembling	See all
CIFAR-10 (with noisy labels)	PGDF (ResNet-18)	See all
Places365-Standard	SWAG (ViT H/14)	See all
Caltech-256	AG-Net	See all
Visual Wake Words	HyT-NAS-BA	See all
CelebA 64x64	cFlow	See all
SIPaKMeD	DL+PCA+GWO	See all
FlickrLogos-32	TC-VII (with outside data)	See all
Malaria Dataset	kEffNet-B0 V2 16ch	See all
Certificate Verification	ResMLP-24	See all
EuroSAT-SAR	FG-MAE (ViT-S/16)	See all
Imbalanced CUB-200-2011	Multi-task	See all
Noisy MNIST (AWGN)	PCGAN-CHAR	See all
Noisy MNIST (Motion)	PCGAN-CHAR	See all
Noisy MNIST (Contrast)	PCGAN-CHAR	See all
WebVision	LRA-diffusion	See all
Fracture/Normal Shoulder Bone X-ray Images on MURA	Our Ensemble Learning-2	See all
HErlev	Fuzzy Distance Ensemble	See all
CIFAR-10, 40% Symmetric Noise	FaMUS	See all
CIFAR-10, 60% Symmetric Noise	MentorMix	See all
CIFAR-100, 40% Symmetric Noise	FaMUS	See all
Large Labelled Logo Dataset (L3D)	L3D_original_2level	See all
Causal3DIdent	SimCLR	See all
CUB-200-2011	Sparse-CBM	See all
CLEVR/Count	SEER (RegNet10B)	See all
CLEVR/Dist	SEER (RegNet10B)	See all
ObjectNet (ImageNet classes)	Diffusion Classifier (zero-shot)	See all
Imagenette	µ2Net+ (ViT-L/16)	See all
ImageNet-100	SparseSwin with L2	See all
Galaxy10 DECals	WaveMix	See all
LIMUC	Inception-v3	See all
CIFAR-10 Image Classification	ASF-former-S	See all
N-Caltech 101	mMND (STDP)	See all
CIFAR-10 (40 Labels, ImageNet-100 Unlabeled)	UnMixMatch	See all
MultiMNIST	CapsNet	See all
QMNIST	VGG-5(Spinal FC)	See all
iCassava'19	E2E-3M	See all
cifar-10,4000	WRN-28-2 + UDA+AutoDropout	See all
ImageNet-10	ResNet-50 + UDA+AutoDropout	See all
LabelMe	CoNAL	See all
Surrey ASL	E2E-3M	See all
ArtDL	ResNet-50	See all
PASCAL VOC 2007	NNCLR	See all
MNIST-rot-12	PDO-eConv (ours)	See all
MNIST-rot-12k (DA)	PDO-eConv (ours)	See all
Sports10	Max Margin Contrastive	See all
CIFAR-100, 60% Symmetric Noise	MentorMix	See all
SARS-COV-2	Fuzzy rank-based fusion of CNN models using Gompertz function	See all
FGVC Aircraft	TransBoost-ResNet50	See all
ImageNet-Sketch	µ2Net+ (ViT-L/16)	See all
Chaoyang	HSANR	See all
PRImA	ResNet-152 2x (RS training)	See all
ISBNet	ThanosNet	See all
FGVC-Aircraft	EnGraf-Net101 (G=4, H=1)	See all
KITTI-Dist	SEER (RegNet10B)	See all
EMNIST-Byclass	ResNet-18	See all
EMNIST-Bymerge	WaveMixLite-128/16	See all
iNat2021-mini	WaveMix-256/16 (level 2)	See all
AmsterTime	AP-GeM (ResNet-101)	See all
PlantDoc	kMobileNet V3 Large 16ch	See all
cifar100	shreynet	See all
CIFAR-100 (alpha=0, 20 clients per round)	FedAvgM + ASAM + SWA	See all
KMNIST	µ2Net	See all
Stanford Online Products	µ2Net+ (ViT-L/16)	See all
CARS196	µ2Net+ (ViT-L/16)	See all
So2Sat LCZ42	ResNet50	See all
SUN397	TransBoost-ResNet50	See all
FEMNIST	pFedBreD_ns_mg	See all
ImageNet-9	SqueezeNet + Simple Bypass	See all
ImageNet-P	SqueezeNet + Simple Bypass	See all
cats_vs_dogs	µ2Net+ (ViT-L/16)	See all
BreakHis	WaveMix-224/10	See all
DVS128 Gesture		See all
KTH-TIPS2	RADAM (ConvNeXt-XL)	See all
FMD (materials)	RADAM (ConvNeXt-L)	See all
Deep PCB	ResNet	See all
mnist	WaveMixLite	See all
Intel Image Classification	ResNet	See all
ImageNet-32	WRN (N=28, k=10)	See all
ImageNet-64	WRN (N=36, k=5)	See all
VizWiz-Classification	VOLO-D5	See all
Split M-NIST	Model with negotiation paradigm	See all
Split Fashion M-NIST	Model with negotiation paradigm	See all
Split CIFAR-10	Model with negotiation paradigm	See all
split CIFAR-100	Model with negotiation paradigm	See all
RGB Arabic Alphabet Sign Language (AASL) dataset	ArabSignNet	See all
No Background RGB Arabic Alphabets Sign Language Dataset	ArabSignNet	See all
cifar10	SAM	See all
ImageNet-Hard	EfficientNet-L2-Ns	See all
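
Image classification leaderboards commonly rank models by top-1 (and sometimes top-5) accuracy, although the metric for each benchmark above is not stated here. As a rough sketch under that assumption, top-k accuracy can be computed as below; the scores and labels are made-up toy values.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    hits = 0
    for row, y in zip(scores, labels):
        # Indices of the k largest scores in this row.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += y in top_k
    return hits / len(labels)

# Toy, made-up scores for 4 images over 3 classes.
scores = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.4, 0.5, 0.1],
    [0.2, 0.3, 0.5],
]
labels = [0, 2, 0, 1]
top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```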

Libraries

Use these libraries to find Image Classification models and implementations.

rwightman/pytorch-image-models • 80 papers • 29,863 stars
osmr/imgclsmob • 51 papers • 2,919 stars
Westlake-AI/openmixup • 36 papers • 574 stars
open-mmlab/mmclassification • 30 papers • 3,173 stars

See all 45 libraries.

Subtasks

Semi-Supervised Image Classification

Learning with noisy labels

Hyperspectral Image Classification

Self-Supervised Image Classification

Small Data Image Classification

Multi-Label Image Classification

Genre classification

Sequential Image Classification

Unsupervised Image Classification

Efficient ViTs

Document Image Classification

Satellite Image Classification

Sparse Representation-based Classification

Photo geolocation estimation

Image Classification with Differential Privacy

Token Reduction

Superpixel Image Classification

Classification Consistency

Gallbladder Cancer Detection

Artistic style classification

Artist classification

Temporal Metadata Manipulation Detection

Misclassification Rate - Natural Adversarial Samples

Scale Generalisation

Most implemented papers

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

microsoft/Swin-Transformer • ICCV 2021

This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision.

Pyramid Scene Parsing Network

hszhao/PSPNet • CVPR 2017

Scene parsing is challenging for unrestricted open vocabulary and diverse scenes.

Searching for MobileNetV3

tensorflow/models • ICCV 2019

We achieve new state of the art results for mobile classification, detection and segmentation.

Explaining and Harnessing Adversarial Examples

tensorflow/cleverhans • 20 Dec 2014

Several machine learning models, including neural networks, consistently misclassify adversarial examples---inputs formed by applying small but intentionally worst-case perturbations to examples from the dataset, such that the perturbed input results in the model outputting an incorrect answer with high confidence.
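
This paper is known for the fast gradient sign method (FGSM). The following is a minimal sketch of one FGSM step on a toy logistic-regression classifier, not the cleverhans implementation; the weights, input, and `eps` value are hypothetical and chosen only so the effect is visible.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One FGSM step: shift each feature by eps in the direction that increases the loss."""
    p = predict(w, b, x)
    # For binary cross-entropy loss, d(loss)/dx_i = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical weights and input, for illustration only.
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.2, -0.1, 0.1], 1
x_adv = fgsm(w, b, x, y, eps=0.2)
p_clean = predict(w, b, x)      # above 0.5: correctly classified as class 1
p_adv = predict(w, b, x_adv)    # below 0.5: the perturbed input flips the prediction
print(round(p_clean, 3), round(p_adv, 3))
```

Each feature moves by at most `eps`, yet the predicted class changes, which is the behavior the abstract describes.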

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size

DeepScale/SqueezeNet • 24 Feb 2016

(2) Smaller DNNs require less bandwidth to export a new model from the cloud to an autonomous car.

Aggregated Residual Transformations for Deep Neural Networks

facebookresearch/ResNeXt • CVPR 2017

Our simple design results in a homogeneous, multi-branch architecture that has only a few hyper-parameters to set.

Towards Deep Learning Models Resistant to Adversarial Attacks

MadryLab/mnist_challenge • ICLR 2018

Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal.