We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.
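The two noising operations named above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names, the mask token string, and the span-length sampling (roughly exponential with mean `avg_span`) are assumptions made here for clarity.

```python
import random

def shuffle_sentences(sentences, rng):
    # Randomly permute the order of the original sentences.
    perm = sentences[:]
    rng.shuffle(perm)
    return perm

def text_infill(tokens, rng, mask="<mask>", mask_ratio=0.3, avg_span=3):
    # Replace spans of tokens with a single mask token each,
    # until roughly mask_ratio of the tokens have been covered.
    out, i = [], 0
    n = len(tokens)
    budget = int(n * mask_ratio)
    while i < n:
        if budget > 0 and rng.random() < mask_ratio:
            span = min(max(1, int(rng.expovariate(1 / avg_span))), budget, n - i)
            out.append(mask)       # one mask stands in for the whole span
            i += span
            budget -= span
        else:
            out.append(tokens[i])
            i += 1
    return out
```

Because each masked span collapses to a single token, the model must also predict how many tokens are missing, not just which ones.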
Ranked #9 on Question Answering on SQuAD1.1 dev (F1 metric)
We show that the use of web-crawled data is preferable to the use of Wikipedia data.
Ranked #1 on Part-Of-Speech Tagging on French GSD
We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale.
We present a new problem: grounding natural language instructions to mobile user interface actions, and create three new datasets for it.
We introduce Stanza, an open-source Python natural language processing toolkit supporting 66 human languages.
Subword segmentation is widely used to address the open vocabulary problem in machine translation.
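As a concrete example of subword segmentation, a byte-pair-encoding (BPE) style scheme applies a learned, priority-ordered list of merge operations to a word's characters. The sketch below is a simplified illustration (no end-of-word marker, toy merge table), not any specific library's implementation.

```python
def apply_bpe(word, merges):
    # merges: list of (left, right) symbol pairs, highest priority first.
    symbols = list(word)
    ranks = {pair: i for i, pair in enumerate(merges)}
    while len(symbols) > 1:
        # Find the adjacent pair with the best (lowest) merge rank.
        pairs = [(ranks.get((symbols[i], symbols[i + 1]), float("inf")), i)
                 for i in range(len(symbols) - 1)]
        best_rank, best_i = min(pairs)
        if best_rank == float("inf"):
            break  # no learnable merge applies; stop
        symbols[best_i:best_i + 2] = [symbols[best_i] + symbols[best_i + 1]]
    return symbols
```

With merges learned from frequent words, rare or unseen words decompose into known subword units, which is how the open-vocabulary problem is sidestepped.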
Ranked #1 on Machine Translation on IWSLT2015 English-Vietnamese
We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models.
However, due to limited data resources from downstream tasks and the extremely large capacity of pre-trained models, aggressive fine-tuning often causes the adapted model to overfit the data of downstream tasks and forget the knowledge of the pre-trained model.
In particular, the prediction of aspect-sentiment pairs is converted into multi-label classification, aiming to capture the dependency between the aspect and sentiment terms within a pair.
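One simple way to realize this conversion is to treat every aspect-sentiment combination as one slot in a binary label vector, so a sentence expressing several pairs gets several active labels at once. The aspect and sentiment inventories below are hypothetical placeholders, and this is only a sketch of the label encoding, not the paper's model.

```python
from itertools import product

ASPECTS = ["food", "service"]          # hypothetical aspect inventory
SENTIMENTS = ["positive", "negative"]  # hypothetical sentiment inventory
LABELS = [f"{a}-{s}" for a, s in product(ASPECTS, SENTIMENTS)]

def encode_pairs(pairs):
    # Binary multi-label vector: one slot per aspect-sentiment combination.
    active = {f"{a}-{s}" for a, s in pairs}
    return [1 if label in active else 0 for label in LABELS]
```

A multi-label classifier trained against such vectors can then score all pairs jointly rather than predicting aspects and sentiments independently.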
We introduce jiant, an open source toolkit for conducting multitask and transfer learning experiments on English NLU tasks.