2 code implementations • COLING 2022 • Kalvin Chang, Chenxuan Cui, Youngmin Kim, David R. Mortensen
Most comparative datasets of Chinese varieties are not digital; however, Wiktionary includes a wealth of transcriptions of words from these varieties.
no code implementations • 20 Feb 2024 • Ryan Soh-Eun Shim, Kalvin Chang, David R. Mortensen
Received wisdom in linguistic typology holds that if the structure of a language becomes more complex in one dimension, it will simplify in another, building on the assumption that all languages are equally complex (Joseph and Newmeyer, 2012).
1 code implementation • 2 Feb 2024 • Kalvin Chang, Nathaniel R. Robinson, Anna Cai, Ting Chen, Annie Zhang, David R. Mortensen
We describe a set of new methods to partially automate linguistic phylogenetic inference given (1) cognate sets with their respective protoforms and sound laws, (2) a mapping from phones to their articulatory features, and (3) a typological database of sound changes.
no code implementations • 6 Dec 2023 • Yi-Hui Chou, Kalvin Chang, Meng-Ju Wu, Winston Ou, Alice Wen-Hsin Bi, Carol Yang, Bryan Y. Chen, Rong-Wei Pai, Po-Yen Yeh, Jo-Peng Chiang, Iu-Tshian Phoann, Winnie Chang, Chenxuan Cui, Noel Chen, Jiatong Shi
Taiwanese Hokkien is declining in use and status due to a language shift towards Mandarin in Taiwan.
1 code implementation • 4 Jul 2023 • Young Min Kim, Kalvin Chang, Chenxuan Cui, David Mortensen
We update their model with the state-of-the-art seq2seq model: the Transformer.
1 code implementation • 5 Apr 2023 • Vilém Zouhar, Kalvin Chang, Chenxuan Cui, Nathaniel Carlson, Nathaniel Robinson, Mrinmaya Sachan, David Mortensen
Mapping words into a fixed-dimensional vector space is the backbone of modern NLP.