Fine-Grained Image Classification

173 papers with code • 35 benchmarks • 36 datasets

Fine-Grained Image Classification is a computer vision task in which the goal is to classify images into subcategories within a larger category, for example distinguishing species of birds or types of flowers. The task is called fine-grained because the model must distinguish subtle differences in visual appearance and patterns, which makes it more challenging than regular image classification.
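The gap between coarse and fine-grained labels can be illustrated with a small sketch. The label names, the fine-to-coarse mapping, and the predictions below are hypothetical and are not taken from any benchmark listed on this page. The sketch only shows how a classifier can be right about the larger category while still being wrong about the subcategory.

```python
# Hypothetical fine-to-coarse label mapping; all names are illustrative only.
FINE_TO_COARSE = {
    "indigo_bunting": "bird",
    "blue_grosbeak": "bird",
    "lazuli_bunting": "bird",
    "daisy": "flower",
    "oxeye_daisy": "flower",
}


def accuracy(predicted, actual):
    """Fraction of positions where the two label lists agree."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def to_coarse(labels):
    """Map each fine-grained label to its larger category."""
    return [FINE_TO_COARSE[label] for label in labels]


# Made-up ground truth and predictions for four images.
actual = ["indigo_bunting", "blue_grosbeak", "lazuli_bunting", "daisy"]
predicted = ["blue_grosbeak", "blue_grosbeak", "indigo_bunting", "oxeye_daisy"]

fine_acc = accuracy(predicted, actual)
coarse_acc = accuracy(to_coarse(predicted), to_coarse(actual))

print(f"fine-grained accuracy: {fine_acc:.2f}")
print(f"coarse accuracy:       {coarse_acc:.2f}")
```

In this made-up example every prediction lands in the correct larger category, yet only one of the four subcategory predictions is correct.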

(Image credit: Looking for the Devil in the Details)

Benchmarks

These leaderboards track progress in Fine-Grained Image Classification.

Dataset	Best Model
Stanford Cars	CMAL-Net
CUB-200-2011	HERBS
FGVC Aircraft	SR-GNN
Oxford 102 Flowers	VIT-L/16 (Background)
CUB-200-2011	HERBS
NABirds	MetaFormer (MetaFormer-2,384)
Stanford Dogs	SR-GNN
Oxford-IIIT Pet Dataset	OmniVec
Food-101	CAP
Caltech-101	VIT-L/16
Oxford-IIIT Pets	EffNet-L2 (SAM)
CompCars	ResNet101-swp
Birdsnap	EffNet-L2 (SAM)
Bird-225	WideResNet-101 (Spinal FC)
SUN397	µ2Net (ViT-L/16)
10 Monkey Species	Inception-v3 (Spinal FC)
Fruits-360	ResNeXt-101
FoodX-251	CSWin-L
Imbalanced CUB-200-2011	PC-Softmax
SOP	Assemble-ResNet-FGVC-50
Con-Text	PHOC descriptor + Fisher Vector Encoding
Bottles	PHOC descriptor + Fisher Vector Encoding
MNIST	Vanilla FC layer only
EMNIST-Digits	VGG-5
EMNIST-Letters	VGG-5
QMNIST	VGG-5
Kuzushiji-MNIST	VGG-5
STL-10	Pre trained wide-resnet-101
BoxCars116K	ResNet152 + COOC
CarFlag-1532	ResNet101-swp
CarFlag-563	ResNet101-swp
iNaturalist	TASN
FGVC-Aircraft	EnGraf-Net101 (G=4, H=1)
Herbarium 2021 Half–Earth	Conviformer-B
Herbarium 2022	Conviformer-B

Libraries

Use these libraries to find Fine-Grained Image Classification models and implementations.

rwightman/pytorch-image-models • 7 papers • 29,890 stars

open-mmlab/mmclassification • 4 papers • 3,177 stars

osmr/imgclsmob • 4 papers • 2,923 stars

Westlake-AI/openmixup • 4 papers • 574 stars

See all 25 libraries.

Datasets

Subtasks

Displaced People Recognition

Latest papers with no code

Adaptive Fine-Grained Predicates Learning for Scene Graph Generation

no code yet • 11 Jul 2022

The performance of current Scene Graph Generation (SGG) models is severely hampered by hard-to-distinguish predicates, e.g., woman-on/standing on/walking on-beach.

Large Neural Networks Learning from Scratch with Very Few Data and without Explicit Regularization

no code yet • 18 May 2022

We show that very large Convolutional Neural Networks with millions of weights do learn with only a handful of training samples and without image augmentation, explicit regularization or pretraining.

Dual Cross-Attention Learning for Fine-Grained Visual Categorization and Object Re-Identification

no code yet • CVPR 2022

First, we propose global-local cross-attention (GLCA) to enhance the interactions between global images and local high-response regions, which can help reinforce the spatial-wise discriminative clues for recognition.

Reinforcing Generated Images via Meta-learning for One-Shot Fine-Grained Visual Recognition

no code yet • 22 Apr 2022

One-shot fine-grained visual recognition often suffers from the problem of having few training examples for new fine-grained classes.

ViT-FOD: A Vision Transformer based Fine-grained Object Discriminator

no code yet • 24 Mar 2022

Recently, several Vision Transformer (ViT) based methods have been proposed for Fine-Grained Visual Classification (FGVC). These methods significantly surpass existing CNN-based ones, demonstrating the effectiveness of ViT in FGVC tasks. However, there are some limitations when applying ViT directly to FGVC. First, ViT needs to split images into patches and calculate the attention of every pair, which may result in heavy redundant calculation and unsatisfying performance when handling fine-grained images with complex background and small objects. Second, a standard ViT only utilizes the class token in the final layer for classification, which is not enough to extract comprehensive fine-grained information.
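The cost of splitting an image into patches and attending over every pair can be made concrete with a little arithmetic. This is a minimal sketch under stated assumptions, not the method of the paper above. The function name `attention_cost`, the 16-pixel patch size, and the 224 and 448 pixel resolutions are all illustrative choices. The count covers ordered patch pairs, including each patch with itself, and ignores the class token.

```python
def attention_cost(image_size, patch_size):
    """Return (num_patches, num_pairs) for a square image cut into square patches.

    num_pairs counts ordered patch pairs, including a patch paired with itself,
    which is the size of one self-attention score matrix. The class token is
    ignored here for simplicity.
    """
    if image_size <= 0 or patch_size <= 0:
        raise ValueError("sizes must be positive")
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side * patches_per_side
    return num_patches, num_patches * num_patches


# Assumed resolutions and patch size, chosen only for illustration.
for size in (224, 448):
    patches, pairs = attention_cost(size, 16)
    print(f"{size}x{size} image: {patches} patches, {pairs} attention pairs")
```

Under these assumptions, doubling the side length quadruples the number of patches and multiplies the number of attention pairs by sixteen, which is one way to see why pairwise attention becomes expensive on high-resolution fine-grained images.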

Automatic Fine-grained Glomerular Lesion Recognition in Kidney Pathology

no code yet • 11 Mar 2022

Recognition of glomeruli lesions is the key for diagnosis and treatment planning in kidney pathology; however, the coexisting glomerular structures such as mesangial regions exacerbate the difficulties of this task.

Bridge the Gap between Supervised and Unsupervised Learning for Fine-Grained Classification

no code yet • 1 Mar 2022

Unsupervised learning technology has caught up with or even surpassed supervised learning technology in general object classification (GOC) and person re-identification (re-ID).

Progressive Multi-stage Interactive Training in Mobile Network for Fine-grained Recognition

no code yet • 8 Dec 2021

Finally, using the progressive training (P), the features extracted by the model in different stages can be fully utilized and fused with each other.

Improved Robustness of Vision Transformer via PreLayerNorm in Patch Embedding

no code yet • 16 Nov 2021

We compared the robustness of CNN and ViT by assuming various image corruptions that may appear in practical vision tasks.

A free lunch from ViT: Adaptive Attention Multi-scale Fusion Transformer for Fine-grained Visual Recognition

no code yet • 4 Oct 2021

Learning subtle representations of object parts plays a vital role in the fine-grained visual recognition (FGVR) field.
