Charformer is a Transformer model that learns subword tokenization end-to-end as part of the model. Specifically, it uses a Gradient-Based Subword Tokenization (GBST) module that automatically learns latent subword representations from characters in a data-driven fashion. The resulting soft subword sequence is then passed through standard Transformer layers.
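The GBST idea can be illustrated with a minimal sketch: pool characters into candidate blocks of several sizes, score each candidate with a learned projection, softmax the scores across block sizes at every position, and take the score-weighted sum as a "soft" subword embedding before downsampling. The code below is a simplified numpy illustration, not the authors' implementation; the random scoring vector `w` stands in for learned parameters, and the sequence length is assumed divisible by the largest block size and by the downsampling rate.

```python
import numpy as np

def gbst(char_embs, block_sizes=(1, 2, 4), downsample=2, rng=None):
    """Minimal sketch of Gradient-Based Subword Tokenization (GBST).

    char_embs: array of shape (seq_len, dim) holding character embeddings.
    Assumes seq_len is divisible by max(block_sizes) and by `downsample`.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    seq_len, dim = char_embs.shape
    w = rng.normal(size=(dim,))  # stand-in for a learned scoring projection

    candidates, scores = [], []
    for b in block_sizes:
        # Mean-pool non-overlapping blocks of size b, then upsample back
        # to character positions by repeating each pooled block b times.
        n_blocks = seq_len // b
        pooled = char_embs[: n_blocks * b].reshape(n_blocks, b, dim).mean(axis=1)
        upsampled = np.repeat(pooled, b, axis=0)
        candidates.append(upsampled)
        scores.append(upsampled @ w)  # one score per position per block size

    cand = np.stack(candidates)              # (num_sizes, seq_len, dim)
    sc = np.stack(scores)                    # (num_sizes, seq_len)
    # Softmax over block sizes at each character position.
    exp = np.exp(sc - sc.max(axis=0))
    p = exp / exp.sum(axis=0)
    soft = (p[..., None] * cand).sum(axis=0)  # soft subword embeddings

    # Downsample by mean pooling so the Transformer sees a shorter sequence.
    n_out = seq_len // downsample
    return soft[: n_out * downsample].reshape(n_out, downsample, dim).mean(axis=1)

embs = np.random.default_rng(1).normal(size=(8, 16))
out = gbst(embs)
print(out.shape)  # sequence shortened from 8 characters to 4 soft subwords
```

Because every step (pooling, scoring, softmax weighting) is differentiable, gradients flow from the downstream Transformer back into the block scorer, which is what makes the tokenization learnable end-to-end.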
Source: Charformer: Fast Character Transformers via Gradient-based Subword Tokenization
| Task | Papers | Share |
|---|---|---|
| NMT | 2 | 18.18% |
| Denoising | 1 | 9.09% |
| Image Denoising | 1 | 9.09% |
| Translation | 1 | 9.09% |
| Toxic Comment Classification | 1 | 9.09% |
| Linguistic Acceptability | 1 | 9.09% |
| Natural Language Inference | 1 | 9.09% |
| Paraphrase Identification | 1 | 9.09% |
| Semantic Textual Similarity | 1 | 9.09% |
| Component | Type |
|---|---|
| Gradient-Based Subword Tokenization | Subword Segmentation |
| Transformer | Transformers |