We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale.
We introduce an architecture to learn joint multilingual sentence representations for 93 languages, belonging to more than 30 different families and written in 28 different scripts.
CROSS-LINGUAL BITEXT MINING CROSS-LINGUAL DOCUMENT CLASSIFICATION CROSS-LINGUAL NATURAL LANGUAGE INFERENCE CROSS-LINGUAL TRANSFER DOCUMENT CLASSIFICATION JOINT MULTILINGUAL SENTENCE REPRESENTATIONS PARALLEL CORPUS MINING
However, these broad-coverage benchmarks have been mostly limited to English, and despite an increasing interest in multilingual models, a benchmark that enables the comprehensive evaluation of such methods on a diverse range of languages and tasks is still missing.
We propose a simple yet effective approach for improving Korean word representations using additional linguistic annotation (i. e. Hanja).
In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages.
We use this data set to evaluate three different cross-lingual transfer methods: (1) translating the training data, (2) using cross-lingual pre-trained embeddings, and (3) a novel method of using a multilingual machine translation encoder as contextual word representations.
In this work, we focus on the multilingual transfer setting where training data in multiple source languages is leveraged to further boost target language performance.
Ranked #9 on Cross-Lingual NER on CoNLL Dutch
Cross-lingual transfer, where a high-resource transfer language is used to improve the accuracy of a low-resource task language, is now an invaluable tool for improving performance of natural language processing (NLP) on low-resource languages.
To tackle the sentiment classification problem in low-resource languages without adequate annotated data, we propose an Adversarial Deep Averaging Network (ADAN) to transfer the knowledge learned from labeled data on a resource-rich source language to low-resource languages where only unlabeled data exists.