no code implementations • 2 Jan 2024 • Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed
The findings of this study contribute to advancing NLP research in low-resource settings, enabling greater accessibility and inclusion for African languages in a rapidly expanding digital landscape.
no code implementations • 24 Oct 2023 • Muhammad Abdul-Mageed, AbdelRahim Elmadany, Chiyu Zhang, El Moatez Billah Nagoudi, Houda Bouamor, Nizar Habash
We describe the findings of the fourth Nuanced Arabic Dialect Identification Shared Task (NADI 2023).
no code implementations • 24 Oct 2023 • AbdelRahim Elmadany, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed
While many researchers have proposed models and solutions for individual problems, there is an acute shortage of a comprehensive Arabic natural language generation toolkit that is capable of handling a wide range of tasks.
no code implementations • 24 Oct 2023 • Mustafa Jarrar, Muhammad Abdul-Mageed, Mohammed Khalilia, Bashar Talafha, AbdelRahim Elmadany, Nagham Hamad, Alaa' Omar
The winning teams achieved F1 scores of 91.96 and 93.73 in FlatNER and NestedNER, respectively.
no code implementations • 17 Oct 2023 • Abdul Waheed, Bashar Talafha, Peter Sullivan, AbdelRahim Elmadany, Muhammad Abdul-Mageed
We train a wide range of models such as HuBERT (DID), Whisper, and XLS-R (ASR) in a supervised setting for Arabic DID and ASR tasks.
no code implementations • 1 Jun 2023 • Peter Sullivan, AbdelRahim Elmadany, Muhammad Abdul-Mageed
As these pipelines require application of ADI tools to potentially out-of-domain data, we aim to investigate how vulnerable the tools may be to this domain shift.
no code implementations • 24 May 2023 • El Moatez Billah Nagoudi, AbdelRahim Elmadany, Ahmed El-Shangiti, Muhammad Abdul-Mageed
We present Dolphin, a novel benchmark that addresses the need for a natural language generation (NLG) evaluation framework dedicated to the wide collection of Arabic languages and varieties.
no code implementations • 21 Apr 2023 • Gagan Bhatia, Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed
We describe our contribution to the SemEval 2023 AfriSenti-SemEval shared task, where we tackle the task of sentiment analysis in 14 different African languages.
no code implementations • 21 Dec 2022 • AbdelRahim Elmadany, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed
Because pretrained language models play a crucial role across NLP, several benchmarks have been proposed to evaluate them.
no code implementations • 21 Dec 2022 • Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed, Alcides Alcoba Inciarte
Multilingual pretrained language models (mPLMs) acquire valuable, generalizable linguistic information during pretraining and have advanced the state of the art on task-specific finetuning.
no code implementations • 21 Dec 2022 • El Moatez Billah Nagoudi, Muhammad Abdul-Mageed, AbdelRahim Elmadany, Alcides Alcoba Inciarte, Md Tawkat Islam Khondaker
Scholarship on generative pretraining (GPT) remains acutely Anglocentric, leaving serious gaps in our understanding of the whole class of autoregressive models.
1 code implementation • 22 Oct 2022 • Md Tawkat Islam Khondaker, El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan
Contrastive learning (CL) brought significant progress to various NLP tasks.
1 code implementation • 21 Oct 2022 • Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed, Alcides Alcoba Inciarte
Problematically, most of the world's 7000+ languages today are not covered by LID technologies.
1 code implementation • 18 Oct 2022 • Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Houda Bouamor, Nizar Habash
We describe findings of the third Nuanced Arabic Dialect Identification Shared Task (NADI 2022).
1 code implementation • OSACT (LREC) 2022 • El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed
We present TURJUMAN, a neural toolkit for translating from 20 languages into Modern Standard Arabic (MSA).
1 code implementation • ACL 2022 • El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed
For evaluation, we introduce a novel benchmark for ARabic language GENeration (ARGEN), covering seven important tasks.
no code implementations • ACL 2021 • Muhammad Abdul-Mageed, AbdelRahim Elmadany, El Moatez Billah Nagoudi
To evaluate our models, we also introduce ARLUE, a new benchmark for multi-dialectal Arabic language understanding evaluation.
no code implementations • NAACL (CALCS) 2021 • El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed
Our work is in the context of the Shared Task on Machine Translation in Code-Switching.
1 code implementation • EACL (WANLP) 2021 • Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Houda Bouamor, Nizar Habash
This Shared Task includes four subtasks: country-level Modern Standard Arabic (MSA) identification (Subtask 1.1), country-level dialect identification (Subtask 1.2), province-level MSA identification (Subtask 2.1), and province-level sub-dialect identification (Subtask 2.2).
2 code implementations • 27 Dec 2020 • Muhammad Abdul-Mageed, AbdelRahim Elmadany, El Moatez Billah Nagoudi
To evaluate our models, we also introduce ARLUE, a new benchmark for multi-dialectal Arabic language understanding evaluation.
1 code implementation • EACL (WANLP) 2021 • Muhammad Abdul-Mageed, Shady Elbassuoni, Jad Doughman, AbdelRahim Elmadany, El Moatez Billah Nagoudi, Yorgo Zoughby, Ahmad Shaher, Iskander Gaba, Ahmed Helal, Mohammed El-Razzaz
We describe DiaLex, a benchmark for intrinsic evaluation of dialectal Arabic word embeddings.
1 code implementation • COLING (WANLP) 2020 • El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed, Tariq Alhindi, Hasan Cavusoglu
Finally, we develop the first models for detecting manipulated Arabic news and achieve state-of-the-art results on Arabic fake news detection (macro F1=70.06).
1 code implementation • EMNLP 2020 • Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Lyle Ungar
Although the prediction of dialects is an important language processing task, with a wide range of applications, existing work is largely limited to coarse-grained varieties.
no code implementations • LREC 2020 • AbdelRahim Elmadany, Chiyu Zhang, Muhammad Abdul-Mageed, Azadeh Hashemi
Social media are pervasive in our lives, making it necessary to ensure safe online experiences by detecting and removing offensive and hate speech.
1 code implementation • EACL 2021 • Muhammad Abdul-Mageed, AbdelRahim Elmadany, El Moatez Billah Nagoudi, Dinesh Pabbi, Kunal Verma, Rannie Lin
We describe Mega-COV, a billion-scale dataset from Twitter for studying COVID-19.
no code implementations • 2 Nov 2019 • Muhammad Abdul-Mageed, Chiyu Zhang, Arun Rajendran, AbdelRahim Elmadany, Michael Przystupa, Lyle Ungar
In this work, we exploit a newly created Arabic dataset with ground-truth age and gender labels to learn these attributes both individually and in a multi-task setting at the sentence level.
no code implementations • 31 Oct 2019 • Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Arun Rajendran, Lyle Ungar
Prediction of language varieties and dialects is an important language processing task, with a wide range of applications.
no code implementations • WS 2019 • Bushra Algotiml, AbdelRahim Elmadany, Walid Magdy
Speech acts are the actions that a speaker intends when performing an utterance within conversations.
no code implementations • LREC 2018 • AbdelRahim Elmadany, Sherif Abdou, Mervat Gheith
The ability to model and automatically detect dialogue acts is an important step toward understanding spontaneous speech and instant messages.