Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis

12 May 2020
Rafael Valle, Kevin Shih, Ryan Prenger, Bryan Catanzaro

In this paper we propose Flowtron: an autoregressive flow-based generative network for text-to-speech synthesis with control over speech variation and style transfer. Flowtron borrows insights from IAF and revamps Tacotron in order to provide high-quality and expressive mel-spectrogram synthesis...

Results from the Paper


Ranked #1 on Text-To-Speech Synthesis on LJSpeech (Pleasantness MOS metric)

TASK                      DATASET   MODEL       METRIC NAME       METRIC VALUE  GLOBAL RANK
Text-To-Speech Synthesis  LJSpeech  Tacotron 2  Pleasantness MOS  3.521         #2
Text-To-Speech Synthesis  LJSpeech  Flowtron    Pleasantness MOS  3.665         #1

Methods used in the Paper