no code implementations • 6 Nov 2023 • Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Dongjune Lee, Nam Soo Kim
We introduce a text-to-speech (TTS) framework based on a neural transducer.
no code implementations • 6 Oct 2022 • Dongjune Lee, Minchan Kim, Sung Hwan Mun, Min Hyun Han, Nam Soo Kim
For training a few-shot keyword spotting (FS-KWS) model, a large labeled dataset containing a massive number of target keywords is known to be essential for generalizing to arbitrary target keywords with only a few enrollment samples.
no code implementations • 17 Aug 2022 • Sung Hwan Mun, Min Hyun Han, Minchan Kim, Dongjune Lee, Nam Soo Kim
The experimental results show that fine-tuning an existing pre-trained model with a disentanglement framework is valid and can further improve performance.
no code implementations • 16 Dec 2021 • Sung Hwan Mun, Min Hyun Han, Dongjune Lee, JiHwan Kim, Nam Soo Kim
In this paper, we propose self-supervised speaker representation learning strategies, which comprise bootstrap equilibrium speaker representation learning in the front-end and uncertainty-aware probabilistic speaker embedding training in the back-end.