no code implementations • 17 Dec 2020 • Jonas Rohnke, Tom Merritt, Jaime Lorenzo-Trueba, Adam Gabrys, Vatsal Aggarwal, Alexis Moinet, Roberto Barra-Chicote
In this paper, we investigate the use of a sentence-level conditioning vector to improve the signal quality of a Parallel WaveNet neural vocoder.
no code implementations • 4 Feb 2020 • Henry B. Moss, Vatsal Aggarwal, Nishant Prateek, Javier González, Roberto Barra-Chicote
We present BOFFIN TTS (Bayesian Optimization For FIne-tuning Neural Text To Speech), a novel approach for few-shot speaker adaptation.
no code implementations • 28 Nov 2019 • Vatsal Aggarwal, Marius Cotescu, Nishant Prateek, Jaime Lorenzo-Trueba, Roberto Barra-Chicote
We propose a Text-to-Speech method that creates an unseen expressive style from a single utterance of expressive speech about one second long.
1 code implementation • 4 Jul 2019 • Jaime Lorenzo-Trueba, Thomas Drugman, Javier Latorre, Thomas Merritt, Bartosz Putrycz, Roberto Barra-Chicote, Alexis Moinet, Vatsal Aggarwal
This vocoder is shown to generate speech of consistently good quality (98% relative mean MUSHRA compared to natural speech), whether the input spectrogram comes from a speaker or style seen during training or from an out-of-domain scenario, when the recording conditions are studio-quality.