Local Patch Interaction (LPI) is a module used in the XCiT layer to enable explicit communication between patches. LPI consists of two depth-wise 3×3 convolutional layers with Batch Normalization and a GELU non-linearity in between. Because its convolutions are depth-wise, the LPI block adds negligible overhead in parameters and only limited overhead in throughput and memory usage during inference.
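
To make the parameter-overhead claim concrete, the following minimal sketch counts the learnable parameters of an LPI-style block (two 3×3 convolutions with one Batch Normalization in between) and compares the depth-wise variant with a hypothetical dense-convolution variant. The helper names, the use of convolution biases and affine Batch Normalization parameters, and the embedding width of 384 are assumptions made for illustration and are not taken from the source.

```python
def conv2d_params(in_ch, out_ch, kernel, groups=1, bias=True):
    """Learnable parameters of a 2D convolution with a square kernel."""
    # Each output channel only sees in_ch // groups input channels.
    weights = (in_ch // groups) * out_ch * kernel * kernel
    return weights + (out_ch if bias else 0)


def batchnorm_params(channels):
    """Learnable scale and shift of Batch Normalization (running stats excluded)."""
    return 2 * channels


def lpi_params(dim, kernel=3, depthwise=True):
    """Parameters of conv -> BatchNorm -> GELU -> conv (GELU has no parameters)."""
    # Depth-wise means one group per channel; groups=1 is a dense convolution.
    groups = dim if depthwise else 1
    conv = conv2d_params(dim, dim, kernel, groups=groups)
    return 2 * conv + batchnorm_params(dim)


dim = 384  # assumed embedding width, for illustration only
depthwise_count = lpi_params(dim)
dense_count = lpi_params(dim, depthwise=False)
print(f"depth-wise LPI: {depthwise_count} parameters")
print(f"dense variant:  {dense_count} parameters")
```

Under these assumptions the depth-wise count grows linearly with the channel width, while the dense variant grows quadratically, which is why the depth-wise structure keeps the block cheap in parameters.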
Source: XCiT: Cross-Covariance Image Transformers

Task | Papers | Share |
---|---|---|
Image Classification | 3 | 33.33% |
Quantization | 1 | 11.11% |
Pose Estimation | 1 | 11.11% |
Instance Segmentation | 1 | 11.11% |
Object Detection | 1 | 11.11% |
Self-Supervised Image Classification | 1 | 11.11% |
Semantic Segmentation | 1 | 11.11% |

Component | Type |
---|---|
Batch Normalization | Normalization |
Depthwise Convolution | Convolutions |
GELU | Activation Functions |