1 code implementation • 12 Jan 2024 • Wonjune Kang, Yun Wang, Shun Zhang, Arthur Hinsvark, Qing He
We propose a multi-task learning (MTL) model for jointly performing three tasks that are commonly solved in a text-to-speech (TTS) front-end: text normalization (TN), part-of-speech (POS) tagging, and homograph disambiguation (HD).
1 code implementation • 23 May 2023 • William Brannon, Suyash Fulay, Hang Jiang, Wonjune Kang, Brandon Roy, Jad Kabbara, Deb Roy
We propose ConGraT (Contrastive Graph-Text pretraining), a general, self-supervised method for jointly learning separate representations of texts and nodes in a parent (or "supervening") graph, where each text is associated with one of the nodes.
1 code implementation • 19 May 2022 • Wonjune Kang, Mark Hasegawa-Johnson, Deb Roy
Zero-shot voice conversion is becoming an increasingly popular research topic, as it promises the ability to transform speech to sound like any speaker.