Audio Generation

64 papers with code • 3 benchmarks • 8 datasets

Audio generation (synthesis) is the task of generating raw audio such as speech.

(Image credit: MelNet)

Most implemented papers

Fast Timing-Conditioned Latent Audio Diffusion

stability-ai/stable-audio-tools 7 Feb 2024

Generating long-form 44.1kHz stereo audio from text prompts can be computationally demanding.
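To give a rough sense of scale for the claim above, the following back-of-the-envelope sketch counts the raw sample values in a stereo clip at 44.1kHz. The function name `raw_sample_count`, the one-minute duration, and the 32-bit float storage are assumptions chosen for illustration; none of them come from the paper or from stable-audio-tools.

```python
def raw_sample_count(duration_s: float,
                     sample_rate_hz: int = 44_100,
                     channels: int = 2) -> int:
    """Number of individual sample values in an uncompressed clip."""
    return int(round(duration_s * sample_rate_hz * channels))


# Hypothetical example: one minute of 44.1kHz stereo audio.
one_minute = raw_sample_count(60.0)

# Assuming 32-bit float samples (4 bytes each), purely for illustration.
one_minute_bytes = one_minute * 4

print(one_minute, one_minute_bytes)
```

A model that works directly on the waveform has to produce every one of these values, which is one reason many systems generate in a compressed latent representation instead.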

Smoothed Dilated Convolutions for Improved Dense Prediction

divelab/dilated 27 Aug 2018

Unlike existing models, which explore solutions by focusing on a block of cascaded dilated convolutional layers, our methods address the gridding artifacts by smoothing the dilated convolution itself.
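The gridding artifact mentioned above can be illustrated with a small NumPy sketch. It shows only the artifact, not the paper's smoothing method: two stacked 1-D dilated convolutions with dilation 2 are applied to an impulse, and every even-indexed output stays zero because those positions never see the odd-indexed input. The helper `dilated_conv1d`, the all-ones kernel, and the signal length are hypothetical choices made for this example.

```python
import numpy as np


def dilated_conv1d(x: np.ndarray, kernel: np.ndarray, dilation: int) -> np.ndarray:
    """Valid-mode 1-D dilated correlation: out[i] = sum_k kernel[k] * x[i + k*dilation]."""
    span = (len(kernel) - 1) * dilation
    out = np.zeros(len(x) - span)
    for k, w in enumerate(kernel):
        out += w * x[k * dilation : k * dilation + len(out)]
    return out


# Impulse placed at an odd index of an otherwise silent signal.
x = np.zeros(16)
x[5] = 1.0

kernel = np.ones(3)

# Two stacked layers, both with dilation 2.
y = dilated_conv1d(dilated_conv1d(x, kernel, 2), kernel, 2)

# Even-indexed outputs only ever read even-indexed inputs, so they miss the impulse.
print("even outputs:", y[0::2])
print("odd outputs: ", y[1::2])
```

Neighbouring outputs are computed from disjoint sets of inputs, which is the checkerboard-like inconsistency that approaches such as smoothing are intended to reduce.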

Conditional WaveGAN

acheketa/cwavegan 27 Sep 2018

Generative models have been used successfully for image synthesis in recent years.

Audio inpainting of music by means of neural networks

andimarafioti/audioContextEncoder 29 Oct 2018

We studied the ability of deep neural networks (DNNs) to restore missing audio content based on its context, a process usually referred to as audio inpainting.

Adversarial Generation of Time-Frequency Features with application in audio synthesis

tifgan/stftGAN 36th International Conference on Machine Learning 2019

We demonstrate the potential of deliberate generative TF modeling by training a generative adversarial network (GAN) on short-time Fourier features.

Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling

f90/Seq-U-Net 14 Nov 2019

In comparison to TCN and WaveNet, our network consistently saves memory and computation time, with speed-ups for training and inference of over 4x in the audio generation experiment in particular, while achieving comparable performance in all tasks.

Music Source Separation in the Waveform Domain

facebookresearch/demucs 27 Nov 2019

Source separation for music is the task of isolating contributions, or stems, from different instruments recorded individually and arranged together to form a song.

Score and Lyrics-Free Singing Voice Generation

ciaua/score_lyrics_free_svg 26 Dec 2019

Generative models for singing voice have been mostly concerned with the task of "singing voice synthesis," i.e., to produce singing voice waveforms given musical scores and text lyrics.

Unconditional Audio Generation with Generative Adversarial Networks and Cycle Regularization

ciaua/unagan 18 May 2020

Audio examples, as well as the code for implementing our model, will be publicly available online upon paper publication.

Perceiving Music Quality with GANs

carlthome/pmqd 11 Jun 2020

By using the human-rated dataset, we show that the discriminator score correlates significantly with the subjective ratings, suggesting that the proposed method can be used to create a no-reference musical audio quality assessment measure.