Dialect Identification
32 papers with code • 0 benchmarks • 3 datasets
Dialectal Arabic Identification
Most implemented papers
North Sámi Dialect Identification with Self-supervised Speech Models
The North Sámi (NS) language encapsulates four primary dialectal variants that are related but that also have differences in their phonology, morphology, and vocabulary.
DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules
We show that DADA is effective for both single-task and instruction-finetuned language models, offering an extensible and interpretable framework for adapting existing LLMs to different English dialects.
RoDia: A New Dataset for Romanian Dialect Identification from Speech
We introduce RoDia, the first dataset for Romanian dialect identification from speech.
Arabic Dialect Identification under Scrutiny: Limitations of Single-label Classification
Automatic Arabic Dialect Identification (ADI) of text has gained great popularity since it was introduced in the early 2010s.
ALDi: Quantifying the Arabic Level of Dialectness of Text
Transcribed speech and user-generated text in Arabic typically contain a mixture of Modern Standard Arabic (MSA), the standardized language taught in schools, and Dialectal Arabic (DA), used in daily communications.
ArTST: Arabic Text and Speech Transformer
We present ArTST, a pre-trained Arabic text and speech transformer for supporting open-source speech technologies for the Arabic language.
Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data
Named Entity Recognition (NER) is a fundamental task to extract key information from texts, but annotated resources are scarce for dialects.