1 code implementation • 21 Nov 2023 • Sang-Hoon Lee, Ha-Yeong Choi, Seung-bin Kim, Seong-Whan Lee
Furthermore, we significantly improve the naturalness and speaker similarity of synthetic speech even in zero-shot speech synthesis scenarios.
1 code implementation • 8 Nov 2023 • Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee
Finally, by using a masked prior in the diffusion model, our model improves speaker adaptation quality.
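The snippet above only names the "masked prior" idea. A minimal illustrative sketch, assuming (this is not the paper's code) that the prior is a source mel-spectrogram whose randomly chosen time frames are replaced by a neutral value, so the diffusion model must fill them in from the target-speaker condition rather than copy the source:

```python
import numpy as np

def masked_prior(mel, mask_ratio=0.5, rng=None):
    """Illustrative sketch, not the paper's method: build a diffusion
    prior by masking random time frames of a source mel-spectrogram.

    mel: array of shape (n_mels, n_frames)
    mask_ratio: fraction of frames replaced with the global mean
    """
    rng = np.random.default_rng(rng)
    prior = mel.copy()
    n_frames = mel.shape[1]
    # Pick distinct frame indices to mask, then overwrite them with a
    # neutral value (here: the global mean of the spectrogram).
    idx = rng.choice(n_frames, size=int(n_frames * mask_ratio), replace=False)
    prior[:, idx] = mel.mean()
    return prior
```

The choice of masking value and ratio here is arbitrary; the point is only that the prior withholds part of the source signal during adaptation.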
no code implementations • 30 Jul 2023 • Sang-Hoon Lee, Ha-Yeong Choi, Hyung-Seok Oh, Seong-Whan Lee
With a hierarchical adaptive structure, the model can adapt to a novel voice style and convert speech progressively.
1 code implementation • 25 May 2023 • Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee
To address this problem, this paper presents decoupled denoising diffusion models (DDDMs) with disentangled representations, which can control the style for each attribute in generative models.
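The decoupling idea named in the abstract can be sketched as one small denoiser per disentangled attribute, with the per-attribute noise predictions combined into a single estimate. Everything below is a hypothetical illustration (the function names, the tanh denoisers, and the averaging rule are assumptions, not the paper's architecture):

```python
import numpy as np

def decoupled_denoise(x_t, attr_reps, weights):
    """Hypothetical sketch of decoupled denoisers: each disentangled
    attribute (e.g. content, pitch, speaker style) gets its own small
    denoiser conditioned only on that attribute's representation.

    x_t: noisy sample, shape (dim,)
    attr_reps: dict mapping attribute name -> representation, shape (dim,)
    weights: dict mapping attribute name -> weight matrix, shape (2*dim, dim)
    """
    preds = []
    for name, rep in attr_reps.items():
        # A toy one-layer denoiser per attribute; the real model would
        # be a learned network, but the decoupling structure is the same.
        preds.append(np.tanh(np.concatenate([x_t, rep]) @ weights[name]))
    # Combine the attribute-specific noise estimates.
    return np.mean(preds, axis=0)
```

Because each denoiser sees only its own attribute representation, swapping one representation (say, the speaker style) changes that attribute's contribution without touching the others, which is the style-control property the abstract claims.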