1 code implementation • 12 Jan 2024 • Wonjune Kang, Yun Wang, Shun Zhang, Arthur Hinsvark, Qing He
We propose a multi-task learning (MTL) model for jointly performing three tasks that are commonly solved in a text-to-speech (TTS) front-end: text normalization (TN), part-of-speech (POS) tagging, and homograph disambiguation (HD).
no code implementations • 21 Apr 2021 • Arthur Hinsvark, Natalie Delworth, Miguel Del Rio, Quinten McNamara, Joshua Dong, Ryan Westerman, Michelle Huang, Joseph Palakapilly, Jennifer Drexler, Ilya Pirkin, Nishchal Bhandari, Miguel Jette
Automatic Speech Recognition (ASR) systems generalize poorly on accented speech.