no code implementations • NAACL (ACL) 2022 • Pragaash Ponnusamy, Clint Solomon Mathialagan, Gustavo Aguilar, Chengyuan Ma, Chenlei Guo
Self-learning paradigms in large-scale conversational AI agents tend to leverage user feedback to bridge the gap between what users say and what they mean.
no code implementations • RepL4NLP (ACL) 2022 • Md Mofijul Islam, Gustavo Aguilar, Pragaash Ponnusamy, Clint Solomon Mathialagan, Chengyuan Ma, Chenlei Guo
Additionally, the dependency on a fixed vocabulary limits the subword models' adaptability across languages and domains.
no code implementations • 19 Feb 2022 • Shuguang Chen, Gustavo Aguilar, Anirudh Srinivasan, Mona Diab, Thamar Solorio
For the unsupervised setting, we provide the following language pairs: English and Spanish-English (Eng-Spanglish), and English and Modern Standard Arabic-Egyptian Arabic (Eng-MSAEA) in both directions.
1 code implementation • EMNLP 2021 • Shuguang Chen, Gustavo Aguilar, Leonardo Neves, Thamar Solorio
Current work in named entity recognition (NER) shows that data augmentation techniques can produce more robust models.
no code implementations • Findings (EMNLP) 2021 • Gustavo Aguilar, Bryan McCann, Tong Niu, Nazneen Rajani, Nitish Keskar, Thamar Solorio
To alleviate these challenges, we propose a character-based subword module (char2subword) that learns the subword embedding table in pre-trained models like BERT.
1 code implementation • WNUT (ACL) 2021 • Shuguang Chen, Gustavo Aguilar, Leonardo Neves, Thamar Solorio
Multimodal named entity recognition (MNER) requires bridging the gap between language understanding and visual context.
no code implementations • SEMEVAL 2020 • Parth Patwa, Gustavo Aguilar, Sudipta Kar, Suraj Pandey, Srinivas PYKL, Björn Gambäck, Tanmoy Chakraborty, Thamar Solorio, Amitava Das
In this paper, we present the results of the SemEval-2020 Task 9 on Sentiment Analysis of Code-Mixed Tweets (SentiMix 2020).
no code implementations • LREC 2020 • Gustavo Aguilar, Sudipta Kar, Thamar Solorio
To facilitate research in this direction, we propose a centralized benchmark for Linguistic Code-switching Evaluation (LinCE) that combines ten corpora covering four different code-switched language pairs (i.e., Spanish-English, Nepali-English, Hindi-English, and Modern Standard Arabic-Egyptian Arabic) and four tasks (i.e., language identification, named entity recognition, part-of-speech tagging, and sentiment analysis).
no code implementations • 8 Oct 2019 • Gustavo Aguilar, Yuan Ling, Yu Zhang, Benjamin Yao, Xing Fan, Chenlei Guo
In this paper, we propose to distill the internal representations of a large model such as BERT into a simplified version of it.
no code implementations • 11 Sep 2019 • Gustavo Aguilar, Thamar Solorio
On the other hand, the global attention spots the most relevant words in the sequence.
1 code implementation • ACL 2020 • Gustavo Aguilar, Thamar Solorio
We show the effectiveness of this transfer learning step by outperforming multilingual BERT and homologous CS-unaware ELMo models and establishing a new state of the art in CS tasks, such as NER and POS tagging.
no code implementations • EMNLP 2020 • Sudipta Kar, Gustavo Aguilar, Mirella Lapata, Thamar Solorio
This paper considers the problem of characterizing stories by inferring properties such as theme and style using written synopses and reviews of movies.
no code implementations • ACL 2019 • Gustavo Aguilar, Viktor Rozgić, Weiran Wang, Chao Wang
Studies on emotion recognition (ER) show that combining lexical and acoustic information results in more robust and accurate models.
1 code implementation • WS 2017 • Gustavo Aguilar, Suraj Maharjan, Adrian Pastor López-Monroy, Thamar Solorio
Named Entity Recognition for social media data is challenging because of its inherent noisiness.
Ranked #21 on Named Entity Recognition (NER) on WNUT 2017
no code implementations • WS 2018 • Gustavo Aguilar, Fahad AlGhamdi, Victor Soto, Mona Diab, Julia Hirschberg, Thamar Solorio
In the third shared task of the Computational Approaches to Linguistic Code-Switching (CALCS) workshop, we focus on Named Entity Recognition (NER) on code-switched social-media data.
no code implementations • NAACL 2018 • Gustavo Aguilar, A. Pastor López-Monroy, Fabio A. González, Thamar Solorio
Our systems outperform the current F1 scores of the state of the art on the Workshop on Noisy User-generated Text 2017 dataset by 2.45% and 3.69%, establishing a more suitable approach for social media environments.
Ranked #17 on Named Entity Recognition (NER) on WNUT 2017