LeViT Explained | Papers With Code

LeViT is a hybrid neural network for fast-inference image classification. It is a stack of transformer blocks with pooling steps that reduce the resolution of the activation maps, as in classical convolutional architectures. This replaces the uniform structure of a Transformer with a pyramid with pooling, similar to the LeNet architecture.

Source: LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference
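
The difference between a uniform transformer stack and a pooled pyramid can be illustrated by tracing how many tokens each block processes. The sketch below is only an illustration: the function name `resolution_pyramid`, the 14x14 starting grid, the block counts, and the stride-2 pooling are assumptions chosen for the example, not values taken from this page.

```python
from math import ceil


def resolution_pyramid(side, blocks_per_stage, stride=2):
    """Trace the token count seen by each transformer block.

    Transformer blocks keep the activation-map resolution unchanged;
    a pooling step between consecutive stages divides the side length
    by `stride`, rounding up. (Hypothetical helper for illustration.)
    """
    trace = []
    for stage, n_blocks in enumerate(blocks_per_stage):
        if stage > 0:
            side = ceil(side / stride)  # pooling step between stages
        # Every block in this stage sees side * side tokens.
        trace.extend([(stage, side * side)] * n_blocks)
    return trace


# Uniform structure: a single stage, so the resolution never changes.
uniform = resolution_pyramid(14, [12])

# Pyramid structure: three stages separated by two pooling steps.
pyramid = resolution_pyramid(14, [4, 4, 4])

print(sorted(set(uniform)))  # [(0, 196)]
print(sorted(set(pyramid)))  # [(0, 196), (1, 49), (2, 16)]
```

With these assumed settings, later blocks in the pyramid operate on far fewer tokens than in the uniform stack, and this is where the pooled design saves computation.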



Tasks

Task	Papers	Share
Anomaly Detection	1	33.33%
General Classification	1	33.33%
Image Classification	1	33.33%


Components

Component	Type
1x1 Convolution	Convolutions
Batch Normalization	Normalization
Convolution	Convolutions
Hard Swish	Activation Functions
Layer Normalization	Normalization
LeViT Attention Block	Attention Modules
Multi-Head Attention	Attention Modules
Scaled Dot-Product Attention	Attention Mechanisms
Softmax	Output Functions

Categories


Vision Transformers
