Pyramid Vision Transformer v2 (PVTv2) is a type of Vision Transformer for detection and segmentation tasks. It improves on PVTv1 with three design changes, each orthogonal to the PVTv1 framework: (1) overlapping patch embedding, (2) convolutional feed-forward networks, and (3) linear-complexity attention layers.
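
To make the first change concrete, the sketch below compares the token grid produced by a non-overlapping patch embedding with the grid produced by an overlapping one. It assumes the overlapping embedding is a strided convolution with kernel size `2*stride - 1` and padding `stride - 1`; that setting is an assumption here and is not stated in this summary. The function names are hypothetical and do not come from the PVTv2 codebase.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Standard output-length formula for a strided convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def non_overlapping_patch_grid(height: int, width: int, patch: int) -> tuple[int, int]:
    """Token grid when the image is cut into disjoint patch x patch tiles."""
    # Kernel equals stride, so neighbouring windows share no pixels.
    return (
        conv_output_size(height, patch, patch, 0),
        conv_output_size(width, patch, patch, 0),
    )


def overlapping_patch_grid(height: int, width: int, stride: int) -> tuple[int, int]:
    """Token grid when neighbouring windows overlap (assumed kernel/padding)."""
    kernel = 2 * stride - 1   # assumption: window is wider than the stride
    padding = stride - 1      # assumption: padding chosen to keep the grid size
    return (
        conv_output_size(height, kernel, stride, padding),
        conv_output_size(width, kernel, stride, padding),
    )


def overlap_pixels(stride: int) -> int:
    """Pixels shared by two adjacent windows under the assumed kernel size."""
    return (2 * stride - 1) - stride


if __name__ == "__main__":
    print(non_overlapping_patch_grid(224, 224, 4))
    print(overlapping_patch_grid(224, 224, 4))
    print(overlap_pixels(4))
```

Under these assumptions both embeddings yield the same number of tokens, so the overlap changes what each token sees without changing the sequence length passed to later stages.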
Source: PVT v2: Improved Baselines with Pyramid Vision Transformer

Task | Papers | Share |
---|---|---|
Camouflaged Object Segmentation | 1 | 16.67% |
Zero-shot Generalization | 1 | 16.67% |
COVID-19 Diagnosis | 1 | 16.67% |
Image Classification | 1 | 16.67% |
Object Detection | 1 | 16.67% |
Panoptic Segmentation | 1 | 16.67% |