Unsupervised Cross-Domain Speech-to-Speech Conversion with Time-Frequency Consistency

In recent years generative adversarial network (GAN) based models have been successfully applied for unsupervised speech-to-speech conversion.The rich compact harmonic view of the magnitude spectrogram is considered a suitable choice for training these models with audio data. To reconstruct the speech signal first a magnitude spectrogram is generated by the neural network, which is then utilized by methods like the Griffin-Lim algorithm to reconstruct a phase spectrogram... (read more)

PDF Abstract
No code implementations yet. Submit your code now



Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper

Griffin-Lim Algorithm
Phase Reconstruction