Language Modelling

4482 papers with code • 51 benchmarks • 157 datasets

Language modeling is the task of predicting the next word or character in a document. Models trained this way can then be applied to a wide range of natural language tasks such as text generation, text classification, and question answering.
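As a toy illustration of next-word prediction, the sketch below counts which word follows which in a tiny made-up corpus and predicts the most frequent continuation. The function names (`train_bigram`, `predict_next`) and the corpus are hypothetical and chosen only for this example; real language models are far larger and use smoothing or neural networks.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev):
    """Return the most frequent continuation of `prev`, or None if unseen."""
    followers = counts.get(prev)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once
```

The same counting scheme works at the character level if the input is a list of characters instead of words.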

Historically, language modelling was done with N-gram language models (which still have niche uses), but neural language models took over in the 2010s, and since the 2020s state-of-the-art (SOTA) results have been achieved exclusively with large language models (LLMs).

A model's language modeling capability is measured using cross-entropy and perplexity. Datasets used to evaluate language modeling include WikiText-103, One Billion Word, Text8, C4, and The Pile, among others.
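The two metrics are closely related: perplexity is the exponential of the average per-token cross-entropy. The sketch below assumes we already have the probability a model assigned to each observed token; the helper names and the sample probabilities are hypothetical, and individual benchmarks may report the value in nats, bits per character, or word-level perplexity.

```python
import math


def cross_entropy(token_probs):
    """Average negative log-likelihood of the observed tokens, in nats."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def perplexity(token_probs):
    """Perplexity is exp(cross-entropy) when cross-entropy is in nats."""
    return math.exp(cross_entropy(token_probs))


# Hypothetical probabilities a model assigned to four observed tokens.
probs = [0.25, 0.25, 0.25, 0.25]
ce_nats = cross_entropy(probs)
ce_bits = ce_nats / math.log(2)  # convert nats to bits
print(ce_nats, ce_bits, perplexity(probs))
```

A model that assigns probability 1/4 to every observed token has a perplexity of about 4, meaning it is as uncertain as a uniform choice among four options. Lower values are better for both metrics.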

Some notable state-of-the-art language models include:

GPT-3
BERT

Check below for all state-of-the-art models.

Here are some additional readings to go deeper on the task:

Language Modeling - Lena Voita

(Image credit: Exploring the Limits of Language Modeling)

Benchmarks

These leaderboards are used to track progress in Language Modelling.

Dataset	Best Model
WikiText-103	RETRO (7.5B)
Penn Treebank (Word Level)	GPT-3 (Zero-Shot)
enwik8	GPT-2 (48 layers, h=1600)
WikiText-2	SparseGPT (175B, 50% Sparsity)
LAMBADA	PaLM-540B (Few-Shot)
Text8	GPT-2
One Billion Word	OmniNetT (Large)
The Pile	GLM-130B
Penn Treebank (Character Level)	Mogrifier LSTM + dynamic eval
Hutter Prize	Transformer-XL + RMS dynamic eval
C4	Primer
Wiki-40B	FLASH-Quad-8k
BIG-bench-lite	GLM-130B (3-shot)
FewCLUE (EPRSTMT)	GLM-130B
FewCLUE (OCNLI-FC)	GLM-130B
FewCLUE (BUSTM)	GLM-130B
FewCLUE (CHID-FC)	GLM-130B
FewCLUE (CLUEWSC-FC)	GLM-130B
CLUE (C3)	GLM-130B
CLUE (WSC1.1)	GLM-130B
CLUE (CMNLI)	GLM-130B
CLUE (DRCD)	GLM-130B
CLUE (OCNLI_50K)	GLM-130B
CLUE (AFQMC)	GLM-130B
CLUE (CMRC2018)	GLM-130B
VietMed	Hybrid 4-gram VietMed-Train + ExtraText
enwiki8	PAR Transformer 24B
PTB Diagnostic ECG Database	I-DARTS
Text8 dev	Transformer-LS (small)
enwik8 dev	Transformer-LS (small)
PubMed Cognitive Control Abstracts	Gopher
DM Mathematics	Gopher
Ubuntu IRC	Gopher
OpenSubtitles	Gopher
OpenWebtext2	Gopher
HackerNews	Gopher
Books3	Gopher
Bookcorpus2	Gopher
Pile CC	Gopher
PhilPapers	Gopher
Gutenberg PG-19	Gopher
Arxiv HEP-TH citation graph	Gopher
StackExchange	Gopher
NIH ExPorter	Gopher
USPTO Backgrounds	Gopher
PubMed Central	Gopher
FreeLaw	Gopher
Curation Corpus	Gopher
GitHub	Gopher
100 sleep nights of 8 caregivers	Gpt3
language-modeling-recommendation	GPT2

Libraries

Use these libraries to find Language Modelling models and implementations

Library	Papers	Stars
huggingface/transformers	30	124,984
faceonlive/ai-research	29	152
microsoft/unilm	12	18,327
pytorch/fairseq	10	29,251

See all 15 libraries.

Datasets

Subtasks

Sentence Pair Modeling

Cross-Document Language Modeling

Controllable Language Modelling

Latest papers

Evaluating Retrieval Quality in Retrieval-Augmented Generation

alirezasalemi7/erag • 21 Apr 2024

Furthermore, evaluation of the retrieval model's performance based on query-document relevance labels shows a small correlation with the RAG system's downstream performance.

Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models

FoundationVision/Groma • 131 stars • 19 Apr 2024

We introduce Groma, a Multimodal Large Language Model (MLLM) with grounded and fine-grained visual perception ability.

MoVA: Adapting Mixture of Vision Experts to Multimodal Context

templex98/mova • 19 Apr 2024

Although some large-scale pretrained vision encoders such as vision encoders in CLIP and DINOv2 have brought promising performance, we found that there is still no single vision encoder that can dominate various image content understanding, e.g., the CLIP vision encoder leads to outstanding results on general image understanding but poor performance on document or chart content.

LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency

damo-nlp-sg/llm-r2 • 19 Apr 2024

In order to maintain equivalence between the rewritten query and the original one during rewriting, traditional query rewrite methods always rewrite the queries following certain rewrite rules.

FineRec: Exploring Fine-grained Sequential Recommendation

zhang-xiaokun/finerec • 19 Apr 2024

Sequential recommendation is dedicated to offering items of interest for users based on their history behaviors.

Length Generalization of Causal Transformers without Position Encoding

antnlp/nope_head_scale • 18 Apr 2024

Generalizing to longer sentences is important for recent Transformer-based language models.

From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency

facebookresearch/multisense_consistency • 18 Apr 2024

The staggering pace with which the capabilities of large language models (LLMs) are increasing, as measured by a range of commonly used natural language understanding (NLU) benchmarks, raises many questions regarding what "understanding" means for a language model and how it compares to human understanding.

Generating Diverse Criteria On-the-Fly to Improve Point-wise LLM Rankers

fangguo1/mcranker • 18 Apr 2024

The most recent pointwise Large Language Model (LLM) rankers have achieved remarkable ranking results.

AccidentBlip2: Accident Detection With Multi-View MotionBlip2

yihuajerry/accidentblip2 • 18 Apr 2024

We also extend our approach to a multi-vehicle cooperative system by deploying Motion Qformer on each vehicle and simultaneously inputting the inference-generated query into the MLP for autoregressive inference.

VG4D: Vision-Language Model Goes 4D Video Recognition

shark0-0/vg4d • 17 Apr 2024

By transferring the knowledge of the VLM to the 4D encoder and combining the VLM, our VG4D achieves improved recognition performance.
