no code implementations • 28 Aug 2023 • Hyungchan Yoon, ChangHwan Kim, Eunwoo Song, Hyun-Wook Yoon, Hong-Goo Kang
To this end, the baseline TTS model needs to be amply generalized to out-of-domain data (i.e., the target speaker's speech).
no code implementations • 5 Jun 2023 • Hoyeon Lee, Hyun-Wook Yoon, Jong-Hwan Kim, Jae-Min Kim
We investigate the effectiveness of zero-shot and few-shot cross-lingual transfer for phrase break prediction using a pre-trained multilingual language model.
no code implementations • 30 Jun 2022 • Eunwoo Song, Ryuichi Yamamoto, Ohsung Kwon, Chan-Ho Song, Min-Jae Hwang, Suhyeon Oh, Hyun-Wook Yoon, Jin-Seob Kim, Jae-Min Kim
In the proposed method, we first adopt a variational autoencoder whose posterior distribution is utilized to extract latent features representing acoustic similarity between the recorded and synthetic corpora.
no code implementations • 21 Apr 2022 • Ryo Terashima, Ryuichi Yamamoto, Eunwoo Song, Yuma Shirahata, Hyun-Wook Yoon, Jae-Min Kim, Kentaro Tachibana
Because pitch-shift data augmentation enables the coverage of a variety of pitch dynamics, it greatly stabilizes training for both VC and TTS models, even when only 1,000 utterances of the target speaker's neutral data are available.
no code implementations • 16 Aug 2020 • Hyun-Wook Yoon, Sang-Hoon Lee, Hyeong-Rae Noh, Seong-Whan Lee
In recent works, flow-based neural vocoders have shown significant improvement in real-time speech generation tasks.