Using IPA-Based Tacotron for Data Efficient Cross-Lingual Speaker Adaptation and Pronunciation Enhancement

12 Nov 2020 · Hamed Hemati, Damian Borth

Recent neural Text-to-Speech (TTS) models have been shown to perform very well when enough data is available. However, fine-tuning them towards a new speaker or a new language is not as straightforward in a low-resource setup...

Methods used in the Paper

METHOD                   TYPE
Sigmoid Activation       Activation Functions
Highway Layer            Miscellaneous Components
Highway Network          Feedforward Networks
Batch Normalization      Normalization
Max Pooling              Pooling Operations
Griffin-Lim Algorithm    Phase Reconstruction
Dropout                  Regularization
BiGRU                    Bidirectional Recurrent Neural Networks
Residual Connection      Skip Connections
Dense Connections        Feedforward Networks
GRU                      Recurrent Neural Networks
Tanh Activation          Activation Functions
Convolution              Convolutions
CBHG                     Speech Synthesis Blocks
ReLU                     Activation Functions
Residual GRU             Recurrent Neural Networks
Additive Attention       Attention Mechanisms
Tacotron                 Text-to-Speech Models
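
Among the listed components, additive (Bahdanau-style) attention is what lets Tacotron's decoder align each output frame with the encoded input sequence. A minimal pure-Python sketch of the scoring step follows; the weight matrices, dimensions, and function name are illustrative assumptions, not the paper's implementation:

```python
import math

def additive_attention(query, keys, w_q, w_k, v):
    """Toy additive attention: score_i = v . tanh(W_q q + W_k k_i),
    followed by a softmax over scores and a weighted sum of keys.
    All weights here are hypothetical; real models learn them."""
    def matvec(m, x):
        # plain-list matrix-vector product
        return [sum(mi * xi for mi, xi in zip(row, x)) for row in m]

    wq_q = matvec(w_q, query)
    scores = []
    for k in keys:
        wk_k = matvec(w_k, k)
        hidden = [math.tanh(a + b) for a, b in zip(wq_q, wk_k)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # numerically stable softmax over the scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]

    # context vector: attention-weighted average of the keys
    context = [sum(w * k[d] for w, k in zip(weights, keys))
               for d in range(len(keys[0]))]
    return weights, context
```

With 2-dimensional identity weights, calling `additive_attention([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], ...)` returns attention weights that sum to one and a context vector of the same dimension as the keys.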