Language Models

PanGu-$\alpha$ is an autoregressive language model (ALM) with up to 200 billion parameters, pretrained on a large corpus of text, mostly in Chinese. The architecture of PanGu-$\alpha$ is based on the Transformer, which has been widely used as the backbone of a variety of pretrained language models such as BERT and GPT. Unlike those models, PanGu-$\alpha$ adds a query layer on top of the Transformer layers, which aims to explicitly induce the expected output.

Source: PanGu-$\alpha$: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
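
The query layer can be pictured as one extra attention layer whose queries come from a learned embedding of the position to be predicted, while keys and values come from the output of the top Transformer layer. Below is a minimal PyTorch sketch of that idea; the class name, dimensions, and the use of `nn.MultiheadAttention` are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn


class QueryLayer(nn.Module):
    """Hypothetical sketch of a PanGu-alpha-style query layer.

    Queries come from a learned positional ("query") embedding for the
    position whose token is being predicted; keys and values come from
    the hidden states produced by the top Transformer layer.
    """

    def __init__(self, d_model: int, n_heads: int, max_len: int = 1024):
        super().__init__()
        self.query_embedding = nn.Embedding(max_len, d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, d_model) from the top Transformer layer
        batch, seq_len, _ = hidden_states.shape
        positions = torch.arange(seq_len, device=hidden_states.device)
        # One query vector per position, shared across the batch
        q = self.query_embedding(positions).unsqueeze(0).expand(batch, -1, -1)
        # Causal mask: position i may only attend to positions <= i
        mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool,
                       device=hidden_states.device),
            diagonal=1,
        )
        out, _ = self.attn(q, hidden_states, hidden_states, attn_mask=mask)
        return out


# Usage with dummy hidden states (shapes are arbitrary):
layer = QueryLayer(d_model=256, n_heads=8)
h = torch.randn(2, 16, 256)
print(layer(h).shape)  # torch.Size([2, 16, 256])
```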

Components


Component     Type
Transformer   Transformers

Categories

Language Models