Fine-Grained Image Classification
172 papers with code • 35 benchmarks • 36 datasets
Fine-Grained Image Classification is a computer vision task in which the goal is to classify images into subcategories within a larger category, for example different species of birds or different types of flowers. The task is considered fine-grained because the model must distinguish subtle differences in visual appearance and patterns, which makes it more challenging than regular image classification.
(Image credit: Looking for the Devil in the Details)
Latest papers
Parameter-Efficient Long-Tailed Recognition
In this paper, we propose PEL, a fine-tuning method that can effectively adapt pre-trained models to long-tailed recognition tasks in fewer than 20 epochs without the need for extra data.
Masking Strategies for Background Bias Removal in Computer Vision Models
Models for fine-grained image classification tasks, where the difference between some classes can be extremely subtle and the number of samples per class tends to be low, are particularly prone to picking up background-related biases and demand robust methods to handle potential examples with out-of-distribution (OOD) backgrounds.
Multiscale patch-based feature graphs for image classification
We compared our approach with two conventional approaches for dealing with image classification.
Task-Oriented Channel Attention for Fine-Grained Few-Shot Classification
While TDM influences high-level feature maps by task-adaptive calibration of channel-wise importance, we further introduce an Instance Attention Module (IAM), which extends QAM and operates in the intermediate layers of feature extractors to highlight object-relevant channels on a per-instance basis.
GIST: Generating Image-Specific Text for Fine-grained Object Classification
We demonstrate the utility of GIST by fine-tuning vision-language models on the image-and-generated-text pairs to learn an aligned vision-language representation space for improved classification.
Diffusion Models Beat GANs on Image Classification
We explore optimal methods for extracting and using these embeddings for classification tasks, demonstrating promising results on the ImageNet classification task.
TOAST: Transfer Learning via Attention Steering
We introduce Top-Down Attention Steering (TOAST), a novel transfer learning algorithm that keeps the pre-trained backbone frozen, selects task-relevant features in the output, and feeds those features back to the model to steer the attention to the task-specific features.
Salient Mask-Guided Vision Transformer for Fine-Grained Classification
Fine-grained visual classification (FGVC) is a challenging computer vision problem, where the task is to automatically recognise objects from subordinate categories.
Reduction of Class Activation Uncertainty with Background Information
Through the class activation mappings (CAMs) of the trained models, we observed the tendency towards looking at a bigger picture with the proposed model training methodology.
Learning Partial Correlation based Deep Visual Representation for Image Classification
Our work obtains a partial correlation based deep visual representation and mitigates the small sample problem often encountered by covariance matrix estimation in CNNs.
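
For readers unfamiliar with the term, the sketch below shows the textbook way to turn a covariance matrix of feature channels into a partial correlation matrix by normalising its inverse (the precision matrix). It is not the method of the paper above: the function name `partial_correlation`, the small ridge term `eps` used to keep the inverse stable, and the synthetic three-channel data are all assumptions made for illustration, and the input is assumed to have at least two channels.

```python
import numpy as np


def partial_correlation(features, eps=1e-6):
    """Partial correlation between channels (hypothetical helper).

    features: array of shape (n_samples, n_channels), n_channels >= 2.
    eps: ridge term added to the covariance diagonal (an assumption,
         used only so that the matrix inverse is numerically stable).
    """
    x = np.asarray(features, dtype=np.float64)
    cov = np.cov(x, rowvar=False)              # (n_channels, n_channels)
    cov = cov + eps * np.eye(cov.shape[0])
    precision = np.linalg.inv(cov)
    # Standard identity: rho_ij = -P_ij / sqrt(P_ii * P_jj)
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


# Synthetic example: x and y are both driven by z, so they are strongly
# correlated, but nearly uncorrelated once z is controlled for.
rng = np.random.default_rng(0)
z = rng.normal(size=5000)
x = z + 0.5 * rng.normal(size=5000)
y = z + 0.5 * rng.normal(size=5000)
feats = np.stack([x, y, z], axis=1)

corr = np.corrcoef(feats, rowvar=False)
pcorr = partial_correlation(feats)
print("correlation(x, y):         %.3f" % corr[0, 1])
print("partial correlation(x, y): %.3f" % pcorr[0, 1])
```

The contrast between the two printed values is the point of the example: ordinary correlation between x and y is high because both depend on z, while their partial correlation is close to zero. With few samples and many channels the covariance estimate becomes ill-conditioned, which is the small sample problem the abstract refers to; the ridge term here is only a simple stand-in for whatever remedy the paper uses.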