Search Results for author: Houda Bouamor

Found 50 papers, 6 papers with code

Gender-Aware Reinflection using Linguistically Enhanced Neural Models

1 code implementation • GeBNLP (COLING) 2020 • Bashar Alhafni, Nizar Habash, Houda Bouamor

In this paper, we present an approach for sentence-level gender reinflection using linguistically enhanced sequence-to-sequence models.

Grammatical Error Correction • Sentence

Hierarchical Aggregation of Dialectal Data for Arabic Dialect Identification

no code implementations • LREC 2022 • Nurpeiis Baimukan, Houda Bouamor, Nizar Habash

We test the value of such aggregation by building language models and using them in dialect identification.

Dialect Identification

Chinese Offensive Language Detection: Current Status and Future Directions

no code implementations • 27 Mar 2024 • Yunze Xiao, Houda Bouamor, Wajdi Zaghouani

Despite the considerable efforts being made to monitor and regulate user-generated content on social media platforms, the pervasiveness of offensive language, such as hate speech or cyberbullying, in the digital space remains a significant challenge.

Cross-Lingual Transfer from Related Languages: Treating Low-Resource Maltese as Multilingual Code-Switching

1 code implementation • 30 Jan 2024 • Kurt Micallef, Nizar Habash, Claudia Borg, Fadhl Eryani, Houda Bouamor

Although multilingual language models exhibit impressive cross-lingual transfer capabilities on unseen languages, the performance on downstream tasks is impacted when there is a script disparity with the languages used in the multilingual model's pre-training data.

Cross-Lingual Transfer • Transliteration

NADI 2023: The Fourth Nuanced Arabic Dialect Identification Shared Task

no code implementations • 24 Oct 2023 • Muhammad Abdul-Mageed, AbdelRahim Elmadany, Chiyu Zhang, El Moatez Billah Nagoudi, Houda Bouamor, Nizar Habash

We describe the findings of the fourth Nuanced Arabic Dialect Identification Shared Task (NADI 2023).

Dialect Identification • Machine Translation +1

The Shared Task on Gender Rewriting

no code implementations • 22 Oct 2022 • Bashar Alhafni, Nizar Habash, Houda Bouamor, Ossama Obeid, Sultan Alrowili, Daliyah AlZeer, Khawlah M. Alshanqiti, Ahmed ElBakry, Muhammad ElNokrashy, Mohamed Gabr, Abderrahmane Issam, Abdelrahim Qaddoumi, K. Vijay-Shanker, Mahmoud Zyate

In this paper, we present the results and findings of the Shared Task on Gender Rewriting, which was organized as part of the Seventh Arabic Natural Language Processing Workshop.

Sentence

NADI 2022: The Third Nuanced Arabic Dialect Identification Shared Task

1 code implementation • 18 Oct 2022 • Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Houda Bouamor, Nizar Habash

We describe findings of the third Nuanced Arabic Dialect Identification Shared Task (NADI 2022).

Dialect Identification • Sentiment Analysis +1

User-Centric Gender Rewriting

1 code implementation • NAACL 2022 • Bashar Alhafni, Nizar Habash, Houda Bouamor

In this paper, we define the task of gender rewriting in contexts involving two users (I and/or You) - first and second grammatical persons with independent grammatical gender preferences.

The Arabic Parallel Gender Corpus 2.0: Extensions and Analyses

no code implementations • LREC 2022 • Bashar Alhafni, Nizar Habash, Houda Bouamor

Much of the research on this issue has focused on mitigating gender bias in English NLP models and systems.

Machine Translation • Text Generation +1

The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models

1 code implementation • EACL (WANLP) 2021 • Go Inoue, Bashar Alhafni, Nurpeiis Baimukan, Houda Bouamor, Nizar Habash

In this paper, we explore the effects of language variants, data sizes, and fine-tuning task types in Arabic pre-trained language models.

Language Modelling

NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task

1 code implementation • EACL (WANLP) 2021 • Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Houda Bouamor, Nizar Habash

This Shared Task includes four subtasks: country-level Modern Standard Arabic (MSA) identification (Subtask 1.1), country-level dialect identification (Subtask 1.2), province-level MSA identification (Subtask 2.1), and province-level sub-dialect identification (Subtask 2.2).

Dialect Identification

A Panoramic Survey of Natural Language Processing in the Arab World

no code implementations • 25 Nov 2020 • Kareem Darwish, Nizar Habash, Mourad Abbas, Hend Al-Khalifa, Huseein T. Al-Natsheh, Samhaa R. El-Beltagy, Houda Bouamor, Karim Bouzoubaa, Violetta Cavalli-Sforza, Wassim El-Hajj, Mustafa Jarrar, Hamdy Mubarak

The term natural language refers to any system of symbolic communication (spoken, signed or written) without intentional human planning and design.

Machine Translation • Optical Character Recognition +6

NADI 2020: The First Nuanced Arabic Dialect Identification Shared Task

no code implementations • COLING (WANLP) 2020 • Muhammad Abdul-Mageed, Chiyu Zhang, Houda Bouamor, Nizar Habash

The data for the shared task covers a total of 100 provinces from 21 Arab countries and are collected from the Twitter domain.

Dialect Identification

A Spelling Correction Corpus for Multiple Arabic Dialects

no code implementations • LREC 2020 • Fadhl Eryani, Nizar Habash, Houda Bouamor, Salam Khalifa

In this paper, we present the MADAR CODA Corpus, a collection of 10,000 sentences from five Arabic city dialects (Beirut, Cairo, Doha, Rabat, and Tunis) represented in the Conventional Orthography for Dialectal Arabic (CODA) in parallel with their raw original form.

Spelling Correction

A Little Linguistics Goes a Long Way: Unsupervised Segmentation with Limited Language Specific Guidance

no code implementations • WS 2019 • Alexander Erdmann, Salam Khalifa, Mai Oudah, Nizar Habash, Houda Bouamor

We present de-lexical segmentation, a linguistically motivated alternative to greedy or other unsupervised methods, requiring only minimal language specific input.

The MADAR Shared Task on Arabic Fine-Grained Dialect Identification

no code implementations • WS 2019 • Houda Bouamor, Sabit Hassan, Nizar Habash

In this paper, we present the results and findings of the MADAR Shared Task on Arabic Fine-Grained Dialect Identification.

Dialect Identification

Automatic Gender Identification and Reinflection in Arabic

no code implementations • WS 2019 • Nizar Habash, Houda Bouamor, Christine Chung

The impressive progress in many Natural Language Processing (NLP) applications has increased the awareness of some of the biases these NLP systems have with regards to gender identities.

Machine Translation • Translation

The FinSBD-2019 Shared Task: Sentence Boundary Detection in PDF Noisy Text in the Financial Domain

no code implementations • WS 2019 • Abderrahim Ait Azzi, Houda Bouamor, Sira Ferradans

Boundary Detection • Sentence

ADIDA: Automatic Dialect Identification for Arabic

no code implementations • NAACL 2019 • Ossama Obeid, Mohammad Salameh, Houda Bouamor, Nizar Habash

This demo paper describes ADIDA, a web-based system for automatic dialect identification for Arabic text.

Dialect Identification

MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction

no code implementations • LREC 2018 • Ossama Obeid, Salam Khalifa, Nizar Habash, Houda Bouamor, Wajdi Zaghouani, Kemal Oflazer

In this paper, we introduce MADARi, a joint morphological annotation and spelling correction system for texts in Standard and Dialectal Arabic.

Dialect Identification • LEMMA +2

Fine-Grained Arabic Dialect Identification

no code implementations • COLING 2018 • Mohammad Salameh, Houda Bouamor, Nizar Habash

Previous work on the problem of Arabic Dialect Identification typically targeted coarse-grained five dialect classes plus Standard Arabic (6-way classification).

Classification • Dialect Identification +3

Unified Guidelines and Resources for Arabic Dialect Orthography

no code implementations • LREC 2018 • Nizar Habash, Fadhl Eryani, Salam Khalifa, Owen Rambow, Dana Abdulrahim, Alexander Erdmann, Reem Faraj, Wajdi Zaghouani, Houda Bouamor, Nasser Zalmout, Sara Hassan, Faisal Al-Shargi, Sakhar Alkhereyf, Basma Abdulkareem, Ramy Eskander, Mohammad Salameh, Hind Saddiki

Speech Recognition • Transliteration

The MADAR Arabic Dialect Corpus and Lexicon

no code implementations • LREC 2018 • Houda Bouamor, Nizar Habash, Mohammad Salameh, Wajdi Zaghouani, Owen Rambow, Dana Abdulrahim, Ossama Obeid, Salam Khalifa, Fadhl Eryani, Alexander Erdmann, Kemal Oflazer

Transliteration

Low Resourced Machine Translation via Morpho-syntactic Modeling: The Case of Dialectal Arabic

no code implementations • MTSummit 2017 • Alexander Erdmann, Nizar Habash, Dima Taji, Houda Bouamor

We present the second ever evaluated Arabic dialect-to-dialect machine translation effort, and the first to leverage external resources beyond a small parallel corpus.

Machine Translation • Translation

Machine Translation Evaluation for Arabic using Morphologically-enriched Embeddings

no code implementations • COLING 2016 • Francisco Guzmán, Houda Bouamor, Ramy Baly, Nizar Habash

Evaluation of machine translation (MT) into morphologically rich languages (MRL) has not been well studied despite posing many challenges.

Community Question Answering • Machine Translation +6

Using Ambiguity Detection to Streamline Linguistic Annotation

no code implementations • WS 2016 • Wajdi Zaghouani, Abdelati Hawwari, Sawsan Alqahtani, Houda Bouamor, Mahmoud Ghoneim, Mona Diab, Kemal Oflazer

Arabic writing is typically underspecified for short vowels and other markups, referred to as diacritics.

Automatic Speech Recognition (ASR) • Machine Translation

Eyes Don't Lie: Predicting Machine Translation Quality Using Eye Movement

no code implementations • NAACL 2016 • Hassan Sajjad, Francisco Guzmán, Nadir Durrani, Ahmed Abdelali, Houda Bouamor, Irina Temnikova, Stephan Vogel

Machine Translation • Translation

DALILA: The Dialectal Arabic Linguistic Learning Assistant

no code implementations • LREC 2016 • Salam Khalifa, Houda Bouamor, Nizar Habash

Dialectal Arabic (DA) poses serious challenges for Natural Language Processing (NLP).

Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation

no code implementations • LREC 2016 • Wajdi Zaghouani, Nizar Habash, Ossama Obeid, Behrang Mohit, Houda Bouamor, Kemal Oflazer

We present our guidelines and annotation procedure to create a human corrected machine translated post-edited corpus for the Modern Standard Arabic.

Machine Translation • Translation

Guidelines and Framework for a Large Scale Arabic Diacritized Corpus

no code implementations • LREC 2016 • Wajdi Zaghouani, Houda Bouamor, Abdelati Hawwari, Mona Diab, Ossama Obeid, Mahmoud Ghoneim, Sawsan Alqahtani, Kemal Oflazer

This paper presents the annotation guidelines developed as part of an effort to create a large scale manually diacritized corpus for various Arabic text genres.