Audio Generation

64 papers with code • 3 benchmarks • 8 datasets

Audio generation (synthesis) is the task of generating raw audio such as speech.

(Image credit: MelNet)

Benchmarks

These leaderboards track progress in audio generation.

Dataset	Best Model
AudioCaps	Audiobox
Classical music, 5 seconds at 12 kHz	Sparse Transformer 152M (strided)
Symphony music	SymphonyNet

Most implemented papers

Fast Timing-Conditioned Latent Audio Diffusion

stability-ai/stable-audio-tools • 7 Feb 2024

Generating long-form 44.1kHz stereo audio from text prompts can be computationally demanding.

Smoothed Dilated Convolutions for Improved Dense Prediction

divelab/dilated • 27 Aug 2018

Unlike existing models, which explore solutions by focusing on a block of cascaded dilated convolutional layers, our methods address the gridding artifacts by smoothing the dilated convolution itself.

Conditional WaveGAN

acheketa/cwavegan • 27 Sep 2018

Generative models have been used successfully for image synthesis in recent years.

Audio inpainting of music by means of neural networks

andimarafioti/audioContextEncoder • 29 Oct 2018

We studied the ability of deep neural networks (DNNs) to restore missing audio content based on its context, a process usually referred to as audio inpainting.

Adversarial Generation of Time-Frequency Features with application in audio synthesis

tifgan/stftGAN • 36th International Conference on Machine Learning 2019

We demonstrate the potential of deliberate generative TF modeling by training a generative adversarial network (GAN) on short-time Fourier features.

Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling

f90/Seq-U-Net • 14 Nov 2019

In comparison to TCN and Wavenet, our network consistently saves memory and computation time, with speed-ups for training and inference of over 4x in the audio generation experiment in particular, while achieving a comparable performance in all tasks.

Music Source Separation in the Waveform Domain

facebookresearch/demucs • 27 Nov 2019

Source separation for music is the task of isolating contributions, or stems, from different instruments recorded individually and arranged together to form a song.

Score and Lyrics-Free Singing Voice Generation

ciaua/score_lyrics_free_svg • 26 Dec 2019

Generative models for singing voice have been mostly concerned with the task of "singing voice synthesis," i.e., to produce singing voice waveforms given musical scores and text lyrics.

Unconditional Audio Generation with Generative Adversarial Networks and Cycle Regularization

ciaua/unagan • 18 May 2020

Audio examples, as well as the code for implementing our model, will be publicly available online upon paper publication.

Perceiving Music Quality with GANs

carlthome/pmqd • 11 Jun 2020

Using the human-rated dataset, we show that the discriminator score correlates significantly with the subjective ratings, suggesting that the proposed method can be used to create a no-reference musical audio quality assessment measure.
