A Cycle-GAN Approach to Model Natural Perturbations in Speech for ASR Applications

Naturally introduced perturbations in audio signal, caused by emotional and physical states of the speaker, can significantly degrade the performance of Automatic Speech Recognition (ASR) systems. In this paper, we propose a front-end based on Cycle-Consistent Generative Adversarial Network (CycleGAN) which transforms naturally perturbed speech into normal speech, and hence improves the robustness of an ASR system... (read more)

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper


METHOD TYPE
Batch Normalization
Normalization
Residual Connection
Skip Connections
PatchGAN
Discriminators
ReLU
Activation Functions
Tanh Activation
Activation Functions
Residual Block
Skip Connection Blocks
Instance Normalization
Normalization
Convolution
Convolutions
Leaky ReLU
Activation Functions
Sigmoid Activation
Activation Functions
GAN Least Squares Loss
Loss Functions
Cycle Consistency Loss
Loss Functions
CycleGAN
Generative Models