Audio Generation
64 papers with code • 3 benchmarks • 8 datasets
Audio generation (synthesis) is the task of generating raw audio such as speech.
Most implemented papers
Fast Timing-Conditioned Latent Audio Diffusion
Generating long-form 44.1kHz stereo audio from text prompts can be computationally demanding.
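To give a rough sense of the scale involved, the following minimal sketch counts the raw sample values in an uncompressed stereo clip at 44.1 kHz. The function name and the 60-second duration are illustrative assumptions, not figures taken from the paper.

```python
SAMPLE_RATE_HZ = 44_100  # 44.1 kHz, as stated in the abstract snippet
CHANNELS = 2             # stereo


def raw_sample_count(duration_s, sample_rate_hz=SAMPLE_RATE_HZ, channels=CHANNELS):
    """Return the number of individual sample values in an uncompressed clip.

    Hypothetical helper for illustration only.
    """
    frames = int(round(duration_s * sample_rate_hz))  # time steps per channel
    return frames * channels


if __name__ == "__main__":
    # 60 seconds is an assumed duration chosen purely for illustration.
    print(raw_sample_count(60))
```

Under these assumptions, one minute of stereo audio already amounts to several million values that a model operating directly on the waveform would have to produce.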
Smoothed Dilated Convolutions for Improved Dense Prediction
Unlike existing models, which explore solutions by focusing on a block of cascaded dilated convolutional layers, our methods address the gridding artifacts by smoothing the dilated convolution itself.
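The gridding effect mentioned above can be illustrated with a small sketch. It assumes a 1-D setting with a kernel size of 3 for simplicity (the paper targets dense prediction more generally), and the helper name is hypothetical. It shows only the artifact, not the smoothing method the paper proposes.

```python
def contributing_offsets(dilation, num_layers, kernel_size=3):
    """Return the set of input offsets (relative to one output position)
    that can influence that output after stacking `num_layers` identical
    dilated 1-D convolutions. Hypothetical helper for illustration only.
    """
    half = kernel_size // 2
    # Offsets touched by a single dilated kernel, e.g. [-2, 0, 2] for dilation 2.
    taps = [k * dilation for k in range(-half, half + 1)]
    offsets = {0}
    for _ in range(num_layers):
        # Each extra layer widens the reach by every tap of the kernel.
        offsets = {o + t for o in offsets for t in taps}
    return offsets


if __name__ == "__main__":
    print(sorted(contributing_offsets(dilation=1, num_layers=2)))
    print(sorted(contributing_offsets(dilation=2, num_layers=2)))
```

With a dilation of 2, only even offsets are ever reached, so neighbouring output positions draw on disjoint sets of inputs. This checkerboard-like sampling is what is commonly called gridding.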
Conditional WaveGAN
Generative models have been used successfully for image synthesis in recent years.
Audio inpainting of music by means of neural networks
We studied the ability of deep neural networks (DNNs) to restore missing audio content based on its context, a process usually referred to as audio inpainting.
Adversarial Generation of Time-Frequency Features with application in audio synthesis
We demonstrate the potential of deliberate generative TF modeling by training a generative adversarial network (GAN) on short-time Fourier features.
Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling
In comparison to TCN and WaveNet, our network consistently saves memory and computation time, with speed-ups for training and inference of over 4x in the audio generation experiment in particular, while achieving comparable performance in all tasks.
Music Source Separation in the Waveform Domain
Source separation for music is the task of isolating contributions, or stems, from different instruments recorded individually and arranged together to form a song.
Score and Lyrics-Free Singing Voice Generation
Generative models for singing voice have been mostly concerned with the task of "singing voice synthesis," i.e., to produce singing voice waveforms given musical scores and text lyrics.
Unconditional Audio Generation with Generative Adversarial Networks and Cycle Regularization
Audio examples, as well as the code for implementing our model, will be publicly available online upon paper publication.
Perceiving Music Quality with GANs
By using the human-rated dataset, we show that the discriminator score correlates significantly with the subjective ratings, suggesting that the proposed method can be used to create a no-reference musical audio quality assessment measure.