OODformer

Introduced by Koner et al. in OODformer: Out-Of-Distribution Detection Transformer

OODformer is a transformer-based OOD detection architecture that leverages the contextualization capabilities of the transformer. Using the transformer as the principal feature extractor makes it possible to exploit object concepts and their discriminative attributes, along with their co-occurrence, via visual attention.

OODformer employs ViT and its data-efficient variant DeiT. Each encoder layer consists of a multi-head self-attention (MSA) block and a multi-layer perceptron (MLP) block. The combination of MSA and MLP layers in the encoder jointly encodes the attributes' importance, associated correlations, and co-occurrence. The [class] token (a representative of an image $x$) consolidates multiple attributes and their related features via the global context. The [class] token from the final layer is used for OOD detection in two ways: first, it is passed to $F_{\text{classifier}}(x_{\text{feat}})$ for a softmax confidence score, and second, it is used for latent-space distance calculation.
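To make the two detection paths concrete, here is a minimal PyTorch sketch, not the paper's reference implementation. It assumes the final-layer [class] token has already been extracted from the ViT/DeiT encoder; `classifier_head` and `class_means` are hypothetical names, and a Euclidean nearest-class-mean distance stands in for the latent-space distance.

```python
import torch
import torch.nn.functional as F

# Sketch of the two OOD scores derived from the final-layer [class] token.
# `classifier_head` (the F_classifier above) and `class_means` are
# hypothetical stand-ins for the trained classification head and the
# per-class means of in-distribution training embeddings.

def softmax_confidence(class_token: torch.Tensor,
                       classifier_head: torch.nn.Module) -> torch.Tensor:
    """Score 1: maximum softmax probability of the classifier head.

    class_token: (batch, dim) [class] embeddings from the final encoder layer.
    Returns (batch,) confidence scores; low confidence suggests OOD.
    """
    logits = classifier_head(class_token)               # (batch, num_classes)
    return F.softmax(logits, dim=-1).max(dim=-1).values

def latent_distance(class_token: torch.Tensor,
                    class_means: torch.Tensor) -> torch.Tensor:
    """Score 2: distance to the nearest in-distribution class mean.

    class_means: (num_classes, dim), estimated on the training set.
    Euclidean distance is used here for illustration; a large distance
    to every class mean suggests OOD.
    """
    dists = torch.cdist(class_token, class_means)       # (batch, num_classes)
    return dists.min(dim=-1).values
```

With a threshold chosen on held-out in-distribution data, inputs whose confidence falls below it (or whose latent distance exceeds it) are flagged as out-of-distribution.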

Source: OODformer: Out-Of-Distribution Detection Transformer
