Language Modelling
4482 papers with code • 51 benchmarks • 157 datasets
Language Modeling is the task of predicting the next word or character in a document. This technique can be used to train language models that can further be applied to a wide range of natural language tasks like text generation, text classification, and question answering.
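As a rough illustration of next-word prediction, the sketch below estimates the probability of the next word from bigram counts over a toy corpus. The function names (`train_bigram`, `next_word_probs`) and the corpus are made up for this example; real language models are trained on far larger data and usually on subword tokens.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_probs(counts, prev):
    """Return the estimated distribution P(next | prev) as a dict."""
    total = sum(counts[prev].values())
    if total == 0:
        return {}  # the word was never seen with a successor
    return {word: c / total for word, c in counts[prev].items()}

# Toy corpus (an assumption for illustration only).
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)
probs = next_word_probs(model, "the")
```

Here `probs` maps each word observed after "the" to its relative frequency, which is the simplest possible estimate of the next-word distribution.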
Historically, language modelling was done with N-gram language models (which still have niche uses). Neural language models took over in the 2010s, and since the 2020s state-of-the-art (SOTA) results have been achieved exclusively with large language models (LLMs).
A model's language modeling capability is measured using cross-entropy and perplexity. Datasets used to evaluate language modeling include WikiText-103, One Billion Word, Text8, C4, and The Pile, among others.
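The relationship between the two metrics can be sketched as follows: cross-entropy is the average negative log-probability the model assigns to each actual next token, and perplexity is its exponential. The function names and the probabilities below are illustrative assumptions, not output from any real model.

```python
import math

def cross_entropy(token_probs):
    """Average negative log-likelihood, in nats per token."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def perplexity(token_probs):
    """Perplexity is the exponential of the cross-entropy."""
    return math.exp(cross_entropy(token_probs))

# Hypothetical probabilities a model assigned to the correct next tokens.
uniform_guess = [0.25, 0.25, 0.25, 0.25]
ce_nats = cross_entropy(uniform_guess)
ce_bits = ce_nats / math.log(2)  # convert nats to bits per token
ppl = perplexity(uniform_guess)
```

A model that assigns probability 0.25 to every correct token has a perplexity of 4, meaning it is as uncertain as a uniform choice among four options. Lower values of both metrics indicate a better model.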
Some notable state-of-the-art language models include:
Check below for all state-of-the-art models.
Here are some additional readings to go deeper on the task:
- Language Modeling - Lena Voita
(Image credit: Exploring the Limits of Language Modeling)
Libraries
Use these libraries to find Language Modelling models and implementations
Datasets
Subtasks
Latest papers
Evaluating Retrieval Quality in Retrieval-Augmented Generation
Furthermore, evaluation of the retrieval model's performance based on query-document relevance labels shows a small correlation with the RAG system's downstream performance.
Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models
We introduce Groma, a Multimodal Large Language Model (MLLM) with grounded and fine-grained visual perception ability.
MoVA: Adapting Mixture of Vision Experts to Multimodal Context
Although some large-scale pretrained vision encoders, such as the vision encoders in CLIP and DINOv2, have brought promising performance, we found that there is still no single vision encoder that can dominate various image content understanding; e.g., the CLIP vision encoder leads to outstanding results on general image understanding but poor performance on document or chart content.
LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency
In order to maintain equivalence between the rewritten query and the original one during rewriting, traditional query rewrite methods always rewrite the queries following certain rewrite rules.
FineRec: Exploring Fine-grained Sequential Recommendation
Sequential recommendation is dedicated to offering items of interest to users based on their historical behaviors.
Length Generalization of Causal Transformers without Position Encoding
Generalizing to longer sentences is important for recent Transformer-based language models.
From Form(s) to Meaning: Probing the Semantic Depths of Language Models Using Multisense Consistency
The staggering pace with which the capabilities of large language models (LLMs) are increasing, as measured by a range of commonly used natural language understanding (NLU) benchmarks, raises many questions regarding what "understanding" means for a language model and how it compares to human understanding.
Generating Diverse Criteria On-the-Fly to Improve Point-wise LLM Rankers
The most recent pointwise Large Language Model (LLM) rankers have achieved remarkable ranking results.
AccidentBlip2: Accident Detection With Multi-View MotionBlip2
We also extend our approach to a multi-vehicle cooperative system by deploying Motion Qformer on each vehicle and simultaneously inputting the inference-generated query into the MLP for autoregressive inference.
VG4D: Vision-Language Model Goes 4D Video Recognition
By transferring the knowledge of the VLM to the 4D encoder and combining the VLM, our VG4D achieves improved recognition performance.