
Spatial-Channel Token Distillation

Introduced by Li et al. in Spatial-Channel Token Distillation for Vision MLPs

Spatial-Channel Token Distillation (STD) is a knowledge distillation (KD) mechanism designed for MLP-like vision models that improves information mixing in both the spatial and channel dimensions of MLP blocks. Instead of modifying the mixing operations themselves, STD adds spatial and channel tokens to the image patches. After forward propagation, these tokens are concatenated and distilled with the teacher's responses as targets. Each token acts as an aggregator of its dimension, encouraging the corresponding mixing operation to extract maximal task-related information from that dimension.

Source: Spatial-Channel Token Distillation for Vision MLPs
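As a concrete illustration, the sketch below shows one way such distillation tokens could be attached to a Mixer-style block and read out for distillation against a teacher. All module names, shapes, and the soft-label KD loss are assumptions made for illustration; this is not the authors' reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixerBlockWithSTDTokens(nn.Module):
    """Mixer-style block operating on patch tokens plus one extra spatial slot
    (appended along the patch axis) and one extra channel slot (appended along
    the feature axis). Input x has shape (B, S+1, D+1) for S patches, D channels."""

    def __init__(self, num_patches, dim, hidden=256):
        super().__init__()
        s, d = num_patches + 1, dim + 1
        self.norm1 = nn.LayerNorm(d)
        self.token_mlp = nn.Sequential(            # spatial (token) mixing
            nn.Linear(s, hidden), nn.GELU(), nn.Linear(hidden, s))
        self.norm2 = nn.LayerNorm(d)
        self.channel_mlp = nn.Sequential(          # channel mixing
            nn.Linear(d, hidden), nn.GELU(), nn.Linear(hidden, d))

    def forward(self, x):
        y = self.norm1(x).transpose(1, 2)          # (B, D+1, S+1)
        x = x + self.token_mlp(y).transpose(1, 2)  # mix across patches
        x = x + self.channel_mlp(self.norm2(x))    # mix across channels
        return x


class STDStudent(nn.Module):
    """Toy student: appends the two distillation tokens to the patch embedding,
    runs the blocks, then concatenates the tokens and projects them to logits
    that are matched against the teacher's responses."""

    def __init__(self, num_patches, dim, num_classes, depth=4):
        super().__init__()
        self.spatial_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.channel_token = nn.Parameter(torch.zeros(1, num_patches + 1, 1))
        self.blocks = nn.Sequential(
            *[MixerBlockWithSTDTokens(num_patches, dim) for _ in range(depth)])
        self.distill_head = nn.Linear((dim + 1) + (num_patches + 1), num_classes)

    def forward(self, patches):                    # patches: (B, S, D)
        b = patches.size(0)
        x = torch.cat([patches, self.spatial_token.expand(b, -1, -1)], dim=1)
        x = torch.cat([x, self.channel_token.expand(b, -1, -1)], dim=2)
        x = self.blocks(x)                         # (B, S+1, D+1)
        spatial_tok = x[:, -1, :]                  # aggregates the spatial dimension
        channel_tok = x[:, :, -1]                  # aggregates the channel dimension
        return self.distill_head(torch.cat([spatial_tok, channel_tok], dim=-1))


def std_distill_loss(student_logits, teacher_logits, tau=1.0):
    """Soft-label KD term: the teacher's responses are the targets."""
    p_teacher = F.softmax(teacher_logits / tau, dim=-1)
    log_p_student = F.log_softmax(student_logits / tau, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * tau ** 2


# Hypothetical usage: 196 patches of 192 channels, 1000 classes.
student = STDStudent(num_patches=196, dim=192, num_classes=1000)
distill_logits = student(torch.randn(8, 196, 192))
loss = std_distill_loss(distill_logits, teacher_logits=torch.randn(8, 1000))
```

In practice this distillation term would be combined with the student's standard classification loss; the sketch covers only the token readout and the soft-label distillation objective.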
