LeViT is a hybrid neural network for fast-inference image classification. It is a stack of transformer blocks, with pooling steps that reduce the resolution of the activation maps, as in classical convolutional architectures. This replaces the uniform structure of a Transformer with a pyramid with pooling, similar to the LeNet architecture.
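To make the pyramid idea concrete, here is a minimal sketch of how the activation-map resolution and token count could shrink from stage to stage. The helper `pyramid_stages`, the `Stage` record, the stem stride of 16, and the channel widths are all assumptions chosen for illustration; they are not taken from the description above and should not be read as the reference implementation.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    resolution: int       # side length of the square activation map
    tokens: int           # resolution * resolution
    channels: int         # embedding width at this stage (illustrative)
    attention_pairs: int  # tokens ** 2, a rough proxy for self-attention cost


def pyramid_stages(input_resolution, stem_stride, channels):
    """Hypothetical helper: one stage per channel width, with a stride-2
    pooling step between consecutive stages."""
    resolution = input_resolution // stem_stride
    stages = []
    for width in channels:
        tokens = resolution * resolution
        stages.append(Stage(resolution, tokens, width, tokens ** 2))
        # Pooling step between stages: halve the side length, rounding up.
        resolution = (resolution + 1) // 2
    return stages


# Assumed configuration, for illustration only.
for stage in pyramid_stages(224, 16, [128, 256, 384]):
    print(stage)
# Resolutions are 14, 7, 4, so token counts are 196, 49, 16.
```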
Source: LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference

Task | Papers | Share |
---|---|---|
Anomaly Detection | 1 | 33.33% |
General Classification | 1 | 33.33% |
Image Classification | 1 | 33.33% |