Wiktionary Normalization of Translations and Morphological Information
We extend the Yawipa Wiktionary Parser (Wu and Yarowsky, 2020) to extract and normalize translations from etymology glosses as well as morphological form-of relations, resulting in 300K unique translations and over 4 million instances of 168 annotated morphological relations. We propose a method to identify typos in translation annotations. Using the extracted morphological data, we develop multilingual neural models for predicting three types of word formation (clipping, contraction, and eye dialect) and improve upon a standard attention baseline by using copy attention.