ELECTRA is a transformer with a new pre-training approach that trains two transformer models: a generator and a discriminator. The generator, trained as a masked language model, replaces tokens in the input sequence; the discriminator (ELECTRA's contribution) then attempts to identify which tokens in the sequence were replaced by the generator. This pre-training task is called replaced token detection, and it replaces masked language modeling as the input-corruption objective.
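To make replaced token detection concrete, the sketch below runs a pre-trained ELECTRA discriminator on a corrupted sentence and reads off its per-token original/replaced predictions. It assumes the Hugging Face `transformers` library and the `google/electra-small-discriminator` checkpoint; the hand-corrupted sentence stands in for the generator's output, so this is an illustrative sketch rather than the pre-training pipeline itself.

```python
# Minimal sketch of replaced token detection at inference time, assuming the
# Hugging Face `transformers` library and the `google/electra-small-discriminator`
# checkpoint (assumptions for illustration, not part of the text above).
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

discriminator = ElectraForPreTraining.from_pretrained("google/electra-small-discriminator")
tokenizer = ElectraTokenizerFast.from_pretrained("google/electra-small-discriminator")

# During pre-training the generator would produce this corruption; here we
# replace one token by hand ("jumps" -> "ate") to stand in for its output.
corrupted = "the quick brown fox ate over the lazy dog"

inputs = tokenizer(corrupted, return_tensors="pt")
with torch.no_grad():
    # One logit per token: positive means the discriminator flags it as replaced.
    logits = discriminator(**inputs).logits

predictions = (logits > 0).long().squeeze(0)
for token, label in zip(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]), predictions):
    print(f"{token:>10s}  {'replaced' if label else 'original'}")
```

During actual pre-training, the discriminator is trained with a binary cross-entropy loss over these per-token predictions, so every input position contributes a learning signal rather than only the ~15% of positions that masked language modeling corrupts.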
Task | Papers | Share |
---|---|---|
Language Modelling | 5 | 23.81% |
Document Classification | 2 | 9.52% |
Reading Comprehension | 2 | 9.52% |
Meta-Learning | 1 | 4.76% |
Document Ranking | 1 | 4.76% |
Learning-To-Rank | 1 | 4.76% |
Passage Re-Ranking | 1 | 4.76% |
Recommendation Systems | 1 | 4.76% |
Named Entity Recognition | 1 | 4.76% |