no code implementations • AMTA 2016 • Sawsan Alqahtani, Mahmoud Ghoneim, Mona Diab
The absence of these diacritics naturally leads to significant word ambiguity, on top of the inherent ambiguity present in fully diacritized words.
no code implementations • WASSA (ACL) 2022 • Valentin Barriere, Shabnam Tafreshi, João Sedoc, Sawsan Alqahtani
This paper presents the results of the WASSA 2022 shared task on predicting empathy, emotion, and personality in reaction to news stories.
1 code implementation • 15 Nov 2023 • Sara Shatnawi, Sawsan Alqahtani, Hanan Aldarmaki
Automatic text-based diacritic restoration models generally have high diacritic error rates when applied to speech transcripts as a result of domain and style shifts in spoken language.
1 code implementation • 15 Dec 2022 • Denis Emelin, Daniele Bonadiman, Sawsan Alqahtani, Yi Zhang, Saab Mansour
Pre-trained language models (PLMs) have advanced the state of the art across NLP applications, but lack domain-specific knowledge that does not naturally occur in pre-training data.
no code implementations • Findings (EMNLP) 2021 • Sawsan Alqahtani, Garima Lalwani, Yi Zhang, Salvatore Romeo, Saab Mansour
Recent studies have proposed different methods to improve multilingual word representations in contextualized settings, including techniques that align source and target embedding spaces.
no code implementations • ACL 2020 • Sawsan Alqahtani, Ajay Mishra, Mona Diab
Such diacritics are often omitted in written text, increasing the number of possible pronunciations and meanings for a word.
no code implementations • IJCNLP 2019 • Sawsan Alqahtani, Ajay Mishra, Mona Diab
Diacritic restoration has gained importance with the growing need for machines to understand written texts.
no code implementations • WS 2019 • Sawsan Alqahtani, Hanan Aldarmaki, Mona Diab
Diacritic restoration could theoretically help disambiguate these words, but in practice, the increase in overall sparsity leads to performance degradation in NLP applications.
no code implementations • WS 2016 • Wajdi Zaghouani, Abdelati Hawwari, Sawsan Alqahtani, Houda Bouamor, Mahmoud Ghoneim, Mona Diab, Kemal Oflazer
Arabic writing is typically underspecified for short vowels and other markups, referred to as diacritics.
no code implementations • LREC 2016 • Wajdi Zaghouani, Houda Bouamor, Abdelati Hawwari, Mona Diab, Ossama Obeid, Mahmoud Ghoneim, Sawsan Alqahtani, Kemal Oflazer
This paper presents the annotation guidelines developed as part of an effort to create a large-scale, manually diacritized corpus for various Arabic text genres.