1 code implementation • 7 May 2021 • Yi-Chen Chen, Po-Han Chi, Shu-wen Yang, Kai-Wei Chang, Jheng-Hao Lin, Sung-Feng Huang, Da-Rong Liu, Chi-Liang Liu, Cheng-Kuang Lee, Hung-Yi Lee
The multi-task learning of a wide variety of speech processing tasks with a universal model has not been well studied.
3 code implementations • 7 Apr 2021 • Jheng-Hao Lin, Yist Y. Lin, Chung-Ming Chien, Hung-Yi Lee
AUTOVC used d-vectors to extract speaker information, while self-supervised learning (SSL) features such as wav2vec 2.0 are used in FragmentVC to extract the phonetic content information.
1 code implementation • 6 Mar 2021 • Chung-Ming Chien, Jheng-Hao Lin, Chien-yu Huang, Po-chun Hsu, Hung-Yi Lee
The few-shot multi-speaker multi-style voice cloning task is to synthesize utterances with voice and speaking style similar to a reference speaker given only a few reference samples.
no code implementations • 24 Nov 2020 • Tzu-Hsien Huang, Jheng-Hao Lin, Chien-yu Huang, Hung-Yi Lee
Voice conversion technologies have improved greatly in recent years with the help of deep learning, but their ability to produce natural-sounding utterances under different conditions remains unclear.
2 code implementations • 27 Oct 2020 • Yist Y. Lin, Chung-Ming Chien, Jheng-Hao Lin, Hung-Yi Lee, Lin-shan Lee
Any-to-any voice conversion aims to convert voice from and to arbitrary speakers, even those unseen during training, which is much more challenging than one-to-one or many-to-many tasks but far more attractive in real-world scenarios.