Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation

8 Jun 2018 · Daniel Stoller, Sebastian Ewert, Simon Dixon

Models for audio source separation usually operate on the magnitude spectrum, which ignores phase information and makes separation performance dependent on hyper-parameters for the spectral front-end. Therefore, we investigate end-to-end source separation in the time domain, which allows modelling phase information and avoids fixed spectral transformations. Due to the high sampling rates of audio, employing a long temporal input context at the sample level is difficult, but it is required for high-quality separation results because of long-range temporal correlations. In this context, we propose the Wave-U-Net, an adaptation of the U-Net to the one-dimensional time domain, which repeatedly resamples feature maps to compute and combine features at different time scales. We introduce further architectural improvements, including an output layer that enforces source additivity, an upsampling technique and a context-aware prediction framework to reduce output artifacts. Experiments for singing voice separation indicate that our architecture yields a performance comparable to a state-of-the-art spectrogram-based U-Net architecture, given the same data. Finally, we reveal a problem with outliers in the currently used SDR evaluation metrics and suggest reporting rank-based statistics to alleviate this problem.
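For intuition, here is a minimal PyTorch sketch of this multi-scale scheme. It is an illustration under assumed settings, not the authors' implementation: the class name `WaveUNetSketch` and the depth, channel counts, kernel sizes and activations are placeholder choices. It downsamples by decimation, upsamples by linear interpolation, concatenates skip connections at matching scales, and uses a difference output layer that predicts K-1 sources and derives the last one as the residual, so the estimates sum to the input mixture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WaveUNetSketch(nn.Module):
    """Illustrative 1-D U-Net for source separation (not the paper's exact model)."""

    def __init__(self, num_sources=2, depth=4, base_ch=24):
        super().__init__()
        self.down = nn.ModuleList()
        ch = 1                                       # mono input
        for i in range(depth):
            out_ch = base_ch * (i + 1)
            self.down.append(nn.Conv1d(ch, out_ch, kernel_size=15, padding=7))
            ch = out_ch
        self.bottleneck = nn.Conv1d(ch, ch, kernel_size=15, padding=7)
        self.up = nn.ModuleList()
        for i in reversed(range(depth)):
            out_ch = base_ch * (i + 1)
            # input: upsampled features plus the skip connection at this scale
            self.up.append(nn.Conv1d(ch + out_ch, out_ch, kernel_size=5, padding=2))
            ch = out_ch
        # difference output layer: predict num_sources - 1, derive the last
        self.out = nn.Conv1d(ch, num_sources - 1, kernel_size=1)

    def forward(self, mix):                          # mix: (batch, 1, time)
        skips, x = [], mix
        for conv in self.down:                       # contracting path
            x = F.leaky_relu(conv(x))
            skips.append(x)
            x = x[:, :, ::2]                         # downsample: keep every other sample
        x = F.leaky_relu(self.bottleneck(x))
        for conv in self.up:                         # expanding path
            x = F.interpolate(x, scale_factor=2, mode="linear", align_corners=False)
            skip = skips.pop()
            x = F.leaky_relu(conv(torch.cat([x[:, :, :skip.shape[-1]], skip], dim=1)))
        first = torch.tanh(self.out(x))              # K-1 sources, bounded by tanh
        last = mix - first.sum(dim=1, keepdim=True)  # K-th source as the residual
        return torch.cat([first, last], dim=1)       # estimates sum to the mixture

# usage: separate a 1 s mono excerpt at 16 kHz (random input, illustrative only)
model = WaveUNetSketch(num_sources=2)
est = model(torch.randn(1, 1, 16000))                # (1, 2, 16000); est.sum(1) ≈ mixture
```

The difference output layer is what "enforces source additivity": only K-1 sources are free parameters, and the last is constrained to whatever remains of the mixture.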


Datasets

MUSDB18

Results from the Paper


| Task | Dataset | Model | Metric | Value (dB) | Global Rank |
|------|---------|-------|--------|------------|-------------|
| Music Source Separation | MUSDB18 | STL2 | SDR (vocals) | 3.25 | #27 |
| Music Source Separation | MUSDB18 | STL2 | SDR (drums) | 4.22 | #27 |
| Music Source Separation | MUSDB18 | STL2 | SDR (other) | 2.25 | #25 |
| Music Source Separation | MUSDB18 | STL2 | SDR (bass) | 3.21 | #26 |
| Music Source Separation | MUSDB18 | STL2 | SDR (avg) | 3.23 | #27 |
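On the evaluation point raised in the abstract: mean SDR across tracks can be dominated by a few extreme values (for instance, tracks with near-silent target segments can produce very low or undefined SDR), whereas rank-based statistics such as the median are robust to them. A small NumPy illustration with hypothetical per-track values, not results from the paper:

```python
import numpy as np

# Hypothetical per-track SDR values in dB, for illustration only.
sdr = np.array([5.1, 4.8, 5.3, 4.9, -38.0])  # one outlier track

print(np.mean(sdr))    # -3.58 -> dominated by the single outlier
print(np.median(sdr))  #  4.90 -> rank-based statistic, robust to the outlier
```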

Methods

U-Net