Unsupervised Speech Domain Adaptation Based on Disentangled Representation Learning for Robust Speech Recognition

In general, the performance of automatic speech recognition (ASR) systems is significantly degraded due to the mismatch between training and test environments. Recently, a deep-learning-based image-to-image translation technique to translate an image from a source domain to a desired domain was presented, and cycle-consistent adversarial network (CycleGAN) was applied to learn a mapping for speech-to-speech conversion from a speaker to a target speaker... (read more)

Results in Papers With Code
(↓ scroll down to see all results)