Transformers

Nyströmformer replaces the self-attention in BERT-small and BERT-base with the proposed Nyström approximation. This reduces self-attention complexity from $O(n^2)$ to $O(n)$ and allows the Transformer to support longer sequences.

Source: Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention
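The core of the approximation can be sketched in a few lines of PyTorch. Instead of materializing the full $n \times n$ softmax attention matrix, it is factored through $m \ll n$ landmark points into three small kernels, $\mathrm{softmax}(QK^T) \approx \tilde{F}\,\tilde{A}^{+}\,\tilde{B}$. The sketch below assumes the landmarks are segment means of the queries and keys (as in the paper) and uses an exact Moore-Penrose pseudoinverse via `torch.linalg.pinv`, where the paper uses an iterative approximation; the function name and the `num_landmarks` default are illustrative, not the paper's reference code.

```python
import torch

def nystrom_attention(q, k, v, num_landmarks=64):
    """Approximate softmax(q k^T / sqrt(d)) v via the Nystrom method.

    q, k, v: (n, d) tensors; assumes n is divisible by num_landmarks
    (the paper pads the sequence so this holds).
    Cost is O(n * m) in the sequence length instead of O(n^2).
    """
    n, d = q.shape
    m = num_landmarks
    scale = d ** -0.5

    # Landmarks: means over m contiguous segments of the queries/keys.
    q_land = q.reshape(m, n // m, d).mean(dim=1)  # (m, d)
    k_land = k.reshape(m, n // m, d).mean(dim=1)  # (m, d)

    # Three small softmax kernels replacing the full n x n attention map.
    f = torch.softmax(q @ k_land.T * scale, dim=-1)       # (n, m)
    a = torch.softmax(q_land @ k_land.T * scale, dim=-1)  # (m, m)
    b = torch.softmax(q_land @ k.T * scale, dim=-1)       # (m, n)

    # softmax(QK^T) V  ~  F A^+ (B V); exact pinv here for simplicity,
    # the paper approximates A^+ iteratively to stay GPU-friendly.
    return f @ (torch.linalg.pinv(a) @ (b @ v))

# Usage: 1024-token sequence, 64-dim heads, 32 landmarks.
q = k = v = torch.randn(1024, 64)
out = nystrom_attention(q, k, v, num_landmarks=32)  # shape (1024, 64)
```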
