Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data

1 Feb 2020 · Kun Zhou, Berrak Sisman, Haizhou Li

Emotional voice conversion aims to convert the spectrum and prosody to change the emotional patterns of speech, while preserving the speaker identity and linguistic content. Many studies require parallel speech data between different emotional patterns, which is not practical in real life...

Methods used in the Paper


METHOD                    TYPE
Batch Normalization       Normalization
Residual Connection       Skip Connections
PatchGAN                  Discriminators
ReLU                      Activation Functions
Tanh Activation           Activation Functions
Residual Block            Skip Connection Blocks
Instance Normalization    Normalization
Convolution               Convolutions
Leaky ReLU                Activation Functions
Sigmoid Activation        Activation Functions
GAN Least Squares Loss    Loss Functions
Cycle Consistency Loss    Loss Functions
CycleGAN                  Generative Models
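
Two of the loss functions listed above, the GAN least squares loss and the cycle consistency loss, have simple, widely used definitions that are easy to state in code. The sketch below shows one common form of each in plain Python. It is an illustration only, not the authors' implementation: the function names, the 0.5 scaling factor, the choice of an L1 distance, and the toy mappings `g_xy` and `g_yx` are all assumptions made for this example.

```python
# Illustrative sketch only: common textbook forms of two losses named in the
# methods list. Names, scaling, and toy data are assumptions for this example.

def lsgan_discriminator_loss(d_real, d_fake):
    """Least-squares GAN loss for the discriminator.

    d_real and d_fake are lists of discriminator scores on real and generated
    samples. Real scores are pushed toward 1, generated scores toward 0.
    """
    real_term = sum((s - 1.0) ** 2 for s in d_real) / len(d_real)
    fake_term = sum(s ** 2 for s in d_fake) / len(d_fake)
    return 0.5 * (real_term + fake_term)


def lsgan_generator_loss(d_fake):
    """Least-squares GAN loss for the generator: push fake scores toward 1."""
    return 0.5 * sum((s - 1.0) ** 2 for s in d_fake) / len(d_fake)


def cycle_consistency_loss(x, x_cycled):
    """Mean absolute (L1) error between an input and its round-trip
    reconstruction, i.e. x mapped to the other domain and back again."""
    return sum(abs(a - b) for a, b in zip(x, x_cycled)) / len(x)


# Hypothetical stand-ins for the two generators of a CycleGAN-style model.
def g_xy(x):
    return [2.0 * v for v in x]


def g_yx(y):
    return [0.5 * v for v in y]


# Toy one-dimensional "features": mapping forward and back recovers the input,
# so the cycle consistency loss is zero for this pair of mappings.
x = [0.5, -1.0, 2.0]
x_cycled = g_yx(g_xy(x))
cycle = cycle_consistency_loss(x, x_cycled)
```

In a CycleGAN-style setup the cycle term is what allows training without parallel data: each generator is penalized when a sample converted to the other domain and back no longer matches the original.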