WaveGlow is a flow-based generative model that generates audio by sampling from a distribution. Specifically, samples are drawn from a zero-mean spherical Gaussian with the same number of dimensions as the desired output, and those samples are passed through a series of layers that transform this simple distribution into the desired one.
Source: WaveGlow: A Flow-based Generative Network for Speech Synthesis
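The sampling procedure described above can be illustrated with a minimal sketch. It is not the WaveGlow implementation: the helper names (`sample_spherical_gaussian`, `make_scale_shift`, `generate`) and the two fixed scale-and-shift layers are assumptions made for illustration, whereas the layers in the real model are learned. The sketch only shows the general idea of drawing from a zero-mean spherical Gaussian and pushing the sample through a sequence of invertible layers.

```python
import random
import statistics


def sample_spherical_gaussian(dim, rng):
    """Draw one sample from a zero-mean, unit-variance spherical Gaussian."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def make_scale_shift(scale, shift):
    """Return a toy invertible layer x -> scale * x + shift (requires scale != 0)."""
    def layer(x):
        return [scale * v + shift for v in x]
    return layer


def generate(dim, layers, rng):
    """Sample z from the simple prior, then apply every layer in order."""
    x = sample_spherical_gaussian(dim, rng)
    for layer in layers:
        x = layer(x)
    return x


rng = random.Random(0)
# Two hypothetical fixed layers; in a trained flow these would be learned.
layers = [make_scale_shift(2.0, 0.0), make_scale_shift(1.0, 3.0)]
samples = generate(10_000, layers, rng)

# The output no longer follows the unit Gaussian it was drawn from.
mean = statistics.fmean(samples)
std = statistics.pstdev(samples)
print(f"mean={mean:.2f}, std={std:.2f}")
```

Under these assumptions, the transformed samples should have a mean close to 3 and a standard deviation close to 2, showing how even a very simple stack of layers reshapes the starting distribution.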
Task | Papers | Share |
---|---|---|
Speech Synthesis | 8 | 36.36% |
Density Estimation | 2 | 9.09% |
Speech Enhancement | 2 | 9.09% |
Text-To-Speech Synthesis | 1 | 4.55% |
Transliteration | 1 | 4.55% |
Style Transfer | 1 | 4.55% |
Audio Classification | 1 | 4.55% |
Face Swapping | 1 | 4.55% |
Synthetic Speech Detection | 1 | 4.55% |
Component | Type |
---|---|
Affine Coupling | Bijective Transformation |
Invertible 1x1 Convolution | Convolutions |
Normalizing Flows | Distribution Approximation |
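To make the "Bijective Transformation" label for affine coupling concrete, the following is a rough sketch of a generic affine coupling layer with its inverse. The function `toy_net` is a hypothetical stand-in for the learned network that produces the scale and shift, and the vector length is assumed to be even. None of these names come from the WaveGlow code base.

```python
import math


def toy_net(xa):
    """Hypothetical stand-in for the learned network mapping xa to (log_s, t)."""
    log_s = [0.5 * math.tanh(v) for v in xa]
    t = [v * v for v in xa]
    return log_s, t


def coupling_forward(x):
    """Leave the first half unchanged; scale and shift the second half."""
    half = len(x) // 2
    xa, xb = x[:half], x[half:]
    log_s, t = toy_net(xa)
    yb = [math.exp(ls) * b + tt for ls, b, tt in zip(log_s, xb, t)]
    # The Jacobian is triangular, so its log-determinant is the sum of log_s.
    log_det = sum(log_s)
    return xa + yb, log_det


def coupling_inverse(y):
    """Recover x exactly: the untouched half reproduces the same log_s and t."""
    half = len(y) // 2
    ya, yb = y[:half], y[half:]
    log_s, t = toy_net(ya)
    xb = [(b - tt) * math.exp(-ls) for ls, b, tt in zip(log_s, yb, t)]
    return ya + xb


x = [0.3, -1.2, 0.8, 2.0]
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)
print(all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(x, x_rec)))
```

Because the first half passes through unchanged, the inverse can recompute the same scale and shift without inverting `toy_net` itself, which is why the network inside a coupling layer can be arbitrarily complex.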