Dialect Identification

32 papers with code • 0 benchmarks • 3 datasets

Dialectal Arabic Identification

Most implemented papers

North Sámi Dialect Identification with Self-supervised Speech Models

skakouros/sami_dialects 19 May 2023

The North Sámi (NS) language encapsulates four primary dialectal variants that are related but that also have differences in their phonology, morphology, and vocabulary.

DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules

salt-nlp/dada 22 May 2023

We show that DADA is effective for both single task and instruction finetuned language models, offering an extensible and interpretable framework for adapting existing LLMs to different English dialects.

RoDia: A New Dataset for Romanian Dialect Identification from Speech

codrut2/rodia 6 Sep 2023

We introduce RoDia, the first dataset for Romanian dialect identification from speech.

Arabic Dialect Identification under Scrutiny: Limitations of Single-label Classification

amr-keleg/adi-under-scrutiny 20 Oct 2023

Automatic Arabic Dialect Identification (ADI) of text has gained great popularity since it was introduced in the early 2010s.

ALDi: Quantifying the Arabic Level of Dialectness of Text

amr-keleg/aldi 20 Oct 2023

Transcribed speech and user-generated text in Arabic typically contain a mixture of Modern Standard Arabic (MSA), the standardized language taught in schools, and Dialectal Arabic (DA), used in daily communications.

ArTST: Arabic Text and Speech Transformer

mbzuai-nlp/artst 25 Oct 2023

We present ArTST, a pre-trained Arabic text and speech transformer for supporting open-source speech technologies for the Arabic language.

Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data

mainlp/barner 19 Mar 2024

Named Entity Recognition (NER) is a fundamental task to extract key information from texts, but annotated resources are scarce for dialects.