Image Classification
3715 papers with code • 165 benchmarks • 239 datasets
Image Classification is a fundamental computer vision task that aims to understand and categorize an image as a whole under a specific label. Unlike object detection, which involves the classification and localization of multiple objects within an image, image classification typically pertains to single-object images. When the classification becomes highly detailed or reaches the instance level, it is often referred to as image retrieval, which also involves finding similar images in a large database.
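To make the single-label nature of the task concrete, here is a minimal sketch using a toy nearest-centroid classifier on synthetic grayscale images. All data, names, and the classifier choice are illustrative assumptions, not any particular benchmark method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: two synthetic "classes" of 8x8 grayscale images.
# Class 0 has a bright top half; class 1 has a bright bottom half.
def make_image(label):
    img = rng.normal(0.2, 0.05, size=(8, 8))
    if label == 0:
        img[:4, :] += 0.6
    else:
        img[4:, :] += 0.6
    return img

train_imgs = [make_image(l) for l in (0, 1) for _ in range(20)]
train_labels = [l for l in (0, 1) for _ in range(20)]

# Nearest-centroid classifier: one mean image per class label.
X = np.stack([im.ravel() for im in train_imgs])
y = np.array(train_labels)
centroids = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def classify(img):
    """Assign the whole image a single label (no localization)."""
    dists = np.linalg.norm(centroids - img.ravel(), axis=1)
    return int(dists.argmin())

print(classify(make_image(0)))  # expected 0
print(classify(make_image(1)))  # expected 1
```

The key contrast with object detection is visible in `classify`: it maps the entire image to one label and never predicts where anything is located.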
Libraries
Use these libraries to find Image Classification models and implementations.
Subtasks
- Out of Distribution (OOD) Detection
- Few-Shot Image Classification
- Fine-Grained Image Classification
- Semi-Supervised Image Classification
- Learning with noisy labels
- Hyperspectral Image Classification
- Self-Supervised Image Classification
- Small Data Image Classification
- Multi-Label Image Classification
- Genre classification
- Sequential Image Classification
- Unsupervised Image Classification
- Document Image Classification
- Satellite Image Classification
- Sparse Representation-based Classification
- Photo geolocation estimation
- Image Classification with Differential Privacy
- Superpixel Image Classification
- Classification Consistency
- Gallbladder Cancer Detection
- Artistic style classification
- Artist classification
- Temporal Metadata Manipulation Detection
- Misclassification Rate - Natural Adversarial Samples
- Scale Generalisation
Latest papers
The Impact of Uniform Inputs on Activation Sparsity and Energy-Latency Attacks in Computer Vision
We empirically examine our findings in a comprehensive evaluation with multiple image classification models and show that our attack achieves the same sparsity effect as prior sponge-example methods, but at a fraction of the computational effort.
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
In this paper, we further adapt the selective scanning process of Mamba to the visual domain, enhancing its ability to learn features from two-dimensional images by (i) a continuous 2D scanning process that improves spatial continuity by ensuring adjacency of tokens in the scanning sequence, and (ii) direction-aware updating which enables the model to discern the spatial relations of tokens by encoding directional information.
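The continuous 2D scanning in (i) can be illustrated with a simple snake-style (boustrophedon) ordering, where every consecutive pair of tokens in the flattened sequence is spatially adjacent on the grid. This is a sketch of the general idea only, not PlainMamba's exact scan.

```python
def continuous_2d_scan(height, width):
    """Snake-style scan of a height x width token grid.

    Rows alternate direction so each token in the flattened sequence
    is spatially adjacent to the previous one -- unlike plain raster
    order, which jumps across the image at every row boundary.
    """
    order = []
    for r in range(height):
        cols = range(width) if r % 2 == 0 else range(width - 1, -1, -1)
        order.extend((r, c) for c in cols)
    return order

seq = continuous_2d_scan(3, 4)
# Verify spatial continuity: consecutive tokens differ by exactly one grid step.
adjacent = all(
    abs(r1 - r2) + abs(c1 - c2) == 1
    for (r1, c1), (r2, c2) in zip(seq, seq[1:])
)
print(seq[:5])   # [(0, 0), (0, 1), (0, 2), (0, 3), (1, 3)]
print(adjacent)  # True
```

A plain raster scan would fail the `adjacent` check at each row boundary, which is precisely the discontinuity the continuous scan is meant to remove.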
The Need for Speed: Pruning Transformers with One Recipe
We introduce the $\textbf{O}$ne-shot $\textbf{P}$runing $\textbf{T}$echnique for $\textbf{I}$nterchangeable $\textbf{N}$etworks ($\textbf{OPTIN}$) framework as a tool to increase the efficiency of pre-trained transformer architectures $\textit{without requiring re-training}$.
Tiny Models are the Computational Saver for Large Models
By searching for and employing the most appropriate tiny model as the computational saver for a given large model, the proposed approaches serve as a novel and generic method for model compression.
DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks
Our tool contributes to the wider adoption of AI-based Gleason grading within the research community and paves the way for broader clinical application of deep learning models in digital pathology.
Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships
Modeling and visualizing relationships between tasks or datasets is an important step towards solving various meta-tasks such as dataset discovery, multi-tasking, and transfer learning.
Histogram Layers for Neural Engineered Features
These engineered features include local binary patterns and edge histogram descriptors among others and they have been shown to be informative features for a variety of computer vision tasks.
CBGT-Net: A Neuromimetic Architecture for Robust Classification of Streaming Data
This paper describes CBGT-Net, a neural network model inspired by the cortico-basal ganglia-thalamic (CBGT) circuits found in mammalian brains.
iDAT: inverse Distillation Adapter-Tuning
The Adapter-Tuning (AT) method involves freezing a pre-trained model and introducing trainable adapter modules to acquire downstream knowledge, thereby calibrating the model for better adaptation to downstream tasks.
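The adapter mechanism described above can be sketched as a small trainable bottleneck added after a frozen layer, with a residual connection. The dimensions, zero initialization, and single-layer setup below are illustrative assumptions, not the iDAT implementation.

```python
import numpy as np

rng = np.random.default_rng(42)
d_model, d_bottleneck = 16, 4

# Frozen pre-trained weight: kept fixed during adapter-tuning.
W_frozen = rng.normal(size=(d_model, d_model))

# Trainable adapter: down-project -> nonlinearity -> up-project,
# added back residually. Only these two matrices would be trained.
W_down = rng.normal(scale=0.01, size=(d_model, d_bottleneck))
W_up = np.zeros((d_bottleneck, d_model))  # zero init: adapter starts as a no-op

def layer_with_adapter(x):
    h = x @ W_frozen                                 # frozen backbone computation
    adapter_out = np.maximum(h @ W_down, 0) @ W_up   # ReLU bottleneck adapter
    return h + adapter_out                           # residual connection

x = rng.normal(size=(2, d_model))
out = layer_with_adapter(x)
print(out.shape)  # (2, 16)
# With W_up zero-initialized, the adapter initially leaves the backbone untouched:
print(np.allclose(out, x @ W_frozen))  # True
```

The bottleneck keeps the trainable parameter count small (2 * d_model * d_bottleneck here) relative to the frozen backbone, which is the core efficiency argument for adapter-tuning.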
VLM-CPL: Consensus Pseudo Labels from Vision-Language Models for Human Annotation-Free Pathological Image Classification
To address this issue, we introduce VLM-CPL, a novel approach based on consensus pseudo labels that integrates two noisy label filtering techniques with a semi-supervised learning strategy.