Dialect Identification

32 papers with code • 0 benchmarks • 3 datasets

Dialectal Arabic Identification

Benchmarks

These leaderboards are used to track progress in Dialect Identification.

No evaluation results yet.

Most implemented papers

North Sámi Dialect Identification with Self-supervised Speech Models

skakouros/sami_dialects • 19 May 2023

The North Sámi (NS) language encapsulates four primary dialectal variants that are related but that also have differences in their phonology, morphology, and vocabulary.

DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules

salt-nlp/dada • 22 May 2023

We show that DADA is effective for both single task and instruction finetuned language models, offering an extensible and interpretable framework for adapting existing LLMs to different English dialects.

RoDia: A New Dataset for Romanian Dialect Identification from Speech

codrut2/rodia • 6 Sep 2023

We introduce RoDia, the first dataset for Romanian dialect identification from speech.

Arabic Dialect Identification under Scrutiny: Limitations of Single-label Classification

amr-keleg/adi-under-scrutiny • 20 Oct 2023

Automatic Arabic Dialect Identification (ADI) of text has gained great popularity since it was introduced in the early 2010s.

ALDi: Quantifying the Arabic Level of Dialectness of Text

amr-keleg/aldi • 20 Oct 2023

Transcribed speech and user-generated text in Arabic typically contain a mixture of Modern Standard Arabic (MSA), the standardized language taught in schools, and Dialectal Arabic (DA), used in daily communications.

ArTST: Arabic Text and Speech Transformer

mbzuai-nlp/artst • 25 Oct 2023

We present ArTST, a pre-trained Arabic text and speech transformer for supporting open-source speech technologies for the Arabic language.

Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data

mainlp/barner • 19 Mar 2024

Named Entity Recognition (NER) is a fundamental task to extract key information from texts, but annotated resources are scarce for dialects.
