Generative Audio Models

WaveGrad

Introduced by Chen et al. in WaveGrad: Estimating Gradients for Waveform Generation

WaveGrad is a conditional model for waveform generation that estimates gradients of the data density. It builds on prior work on score matching and diffusion probabilistic models. Sampling starts from Gaussian white noise and iteratively refines the signal with a gradient-based sampler conditioned on the mel-spectrogram. WaveGrad is non-autoregressive and requires only a constant number of generation steps during inference. It can generate high-fidelity audio samples in as few as 6 iterations.
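The iterative refinement described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the update rule follows the standard diffusion-style reverse step, `dummy_noise_predictor` is a hypothetical stand-in for the trained network (it simply returns zeros), and the noise schedule `betas`, the `hop_length` of 300 samples per mel frame, and the choice of per-step noise scale are illustrative values only.

```python
import numpy as np


def dummy_noise_predictor(y, mel, noise_level):
    """Hypothetical placeholder for a trained noise-estimation network.

    A real model would predict the noise contained in `y`, conditioned on
    the mel-spectrogram and the current noise level. This stub returns zeros.
    """
    return np.zeros_like(y)


def sample(mel, betas, predict_noise=dummy_noise_predictor,
           hop_length=300, seed=0):
    """Refine Gaussian white noise into a waveform in len(betas) steps."""
    rng = np.random.default_rng(seed)
    betas = np.asarray(betas, dtype=np.float64)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    # Assumption: each mel frame corresponds to `hop_length` audio samples.
    num_samples = mel.shape[-1] * hop_length

    # Start from Gaussian white noise.
    y = rng.standard_normal(num_samples)

    # Walk the schedule from the noisiest step down to the cleanest.
    for n in range(len(betas) - 1, -1, -1):
        eps = predict_noise(y, mel, np.sqrt(alpha_bars[n]))
        # Remove the estimated noise component and rescale.
        y = (y - (1.0 - alphas[n]) / np.sqrt(1.0 - alpha_bars[n]) * eps)
        y = y / np.sqrt(alphas[n])
        # Re-inject a smaller amount of noise on every step except the last.
        if n > 0:
            sigma = np.sqrt(
                (1.0 - alpha_bars[n - 1]) / (1.0 - alpha_bars[n]) * betas[n]
            )
            y = y + sigma * rng.standard_normal(num_samples)
    return y


# Six refinement steps on a dummy 80-bin, 10-frame mel-spectrogram.
mel = np.zeros((80, 10))
betas = np.linspace(1e-4, 0.5, 6)  # illustrative schedule, not a tuned one
audio = sample(mel, betas)
```

The number of network evaluations equals the length of `betas` and does not depend on the length of the output waveform, which is what "a constant number of generation steps" refers to.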

Source: WaveGrad: Estimating Gradients for Waveform Generation

Tasks


Task                       Papers   Share
Speech Synthesis           5        45.45%
Image Generation           2        18.18%
Denoising                  2        18.18%
Text-To-Speech Synthesis   2        18.18%

Categories