Cedille: A large autoregressive French language model

7 Feb 2022 · Martin Müller, Florian Laurent

Scaling up the size and training of autoregressive language models has enabled novel ways of solving Natural Language Processing tasks using zero-shot and few-shot learning. While extreme-scale language models such as GPT-3 offer multilingual capabilities, zero-shot learning for languages other than English remains largely unexplored. Here, we introduce Cedille, a large open-source autoregressive language model, specifically trained for the French language. Our results show that Cedille outperforms existing French language models and is competitive with GPT-3 on a range of French zero-shot benchmarks. Furthermore, we provide an in-depth comparison of the toxicity exhibited by these models, showing that Cedille marks an improvement in language model safety thanks to dataset filtering.

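To make the zero-shot setting concrete, below is a minimal sketch of prompting a French autoregressive model through the Hugging Face transformers library. It assumes the released checkpoint is available on the Hub under the identifier `Cedille/fr-boris`; the prompt format and generation settings are illustrative and do not reproduce the paper's evaluation protocol.

```python
# Minimal zero-shot prompting sketch with Hugging Face transformers.
# Assumption: the Cedille checkpoint is published as "Cedille/fr-boris";
# swap in the actual model identifier if it differs.
from transformers import pipeline

generator = pipeline("text-generation", model="Cedille/fr-boris")

# Zero-shot: the task is stated entirely in the French prompt,
# with no fine-tuning and no in-context examples.
prompt = "Question : Quelle est la capitale de la France ?\nRéponse :"

output = generator(prompt, max_new_tokens=20, do_sample=False)
print(output[0]["generated_text"])
```

Greedy decoding (`do_sample=False`) is used here only to keep the example deterministic; benchmark-style evaluation would typically score candidate continuations rather than free-form generations.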