Search Results for author: Gustavo Aguilar

Found 16 papers, 4 papers with code

CALCS 2021 Shared Task: Machine Translation for Code-Switched Data

no code implementations 19 Feb 2022 Shuguang Chen, Gustavo Aguilar, Anirudh Srinivasan, Mona Diab, Thamar Solorio

For the unsupervised setting, we provide the following language pairs: English and Spanish-English (Eng-Spanglish), and English and Modern Standard Arabic-Egyptian Arabic (Eng-MSAEA) in both directions.

Language Identification Machine Translation +3

Char2Subword: Extending the Subword Embedding Space Using Robust Character Compositionality

no code implementations Findings (EMNLP) 2021 Gustavo Aguilar, Bryan McCann, Tong Niu, Nazneen Rajani, Nitish Keskar, Thamar Solorio

To alleviate these challenges, we propose a character-based subword module (char2subword) that learns the subword embedding table in pre-trained models like BERT.

LinCE: A Centralized Benchmark for Linguistic Code-switching Evaluation

no code implementations LREC 2020 Gustavo Aguilar, Sudipta Kar, Thamar Solorio

To facilitate research in this direction, we propose a centralized benchmark for Linguistic Code-switching Evaluation (LinCE) that combines ten corpora covering four different code-switched language pairs (i.e., Spanish-English, Nepali-English, Hindi-English, and Modern Standard Arabic-Egyptian Arabic) and four tasks (i.e., language identification, named entity recognition, part-of-speech tagging, and sentiment analysis).

Language Identification named-entity-recognition +4

Knowledge Distillation from Internal Representations

no code implementations 8 Oct 2019 Gustavo Aguilar, Yuan Ling, Yu Zhang, Benjamin Yao, Xing Fan, Chenlei Guo

In this paper, we propose to distill the internal representations of a large model such as BERT into a simplified version of it.

Knowledge Distillation

From English to Code-Switching: Transfer Learning with Strong Morphological Clues

1 code implementation ACL 2020 Gustavo Aguilar, Thamar Solorio

We show the effectiveness of this transfer learning step by outperforming multilingual BERT and homologous CS-unaware ELMo models and establishing a new state of the art in CS tasks, such as NER and POS tagging.

Language Identification NER +4

Multi-view Story Characterization from Movie Plot Synopses and Reviews

no code implementations EMNLP 2020 Sudipta Kar, Gustavo Aguilar, Mirella Lapata, Thamar Solorio

This paper considers the problem of characterizing stories by inferring properties such as theme and style using written synopses and reviews of movies.

TAG

Multimodal and Multi-view Models for Emotion Recognition

no code implementations ACL 2019 Gustavo Aguilar, Viktor Rozgić, Weiran Wang, Chao Wang

Studies on emotion recognition (ER) show that combining lexical and acoustic information results in more robust and accurate models.

Emotion Recognition MULTI-VIEW LEARNING

Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task

no code implementations WS 2018 Gustavo Aguilar, Fahad AlGhamdi, Victor Soto, Mona Diab, Julia Hirschberg, Thamar Solorio

In the third shared task of the Computational Approaches to Linguistic Code-Switching (CALCS) workshop, we focus on Named Entity Recognition (NER) on code-switched social-media data.

named-entity-recognition Named Entity Recognition +2

Modeling Noisiness to Recognize Named Entities using Multitask Neural Networks on Social Media

no code implementations NAACL 2018 Gustavo Aguilar, A. Pastor López-Monroy, Fabio A. González, Thamar Solorio

Our systems outperform the current F1 scores of the state of the art on the Workshop on Noisy User-generated Text 2017 dataset by 2.45% and 3.69%, establishing a more suitable approach for social media environments.

Named Entity Recognition (NER) Word Embeddings
