Music Source Separation
53 papers with code • 3 benchmarks • 7 datasets
Music source separation is the task of decomposing music into its constituent components, e.g., yielding separated stems for the vocals, bass, and drums.
(Image credit: SigSep)
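A minimal sketch of the idea, assuming synthetic stand-in stems (pure tones rather than real music) and an oracle "ideal ratio mask" in the frequency domain — the kind of target many MSS systems learn to approximate from the mixture alone:

```python
import numpy as np

rng = np.random.default_rng(0)
sr = 8000                                   # sample rate (Hz), arbitrary
t = np.arange(sr) / sr
# Stand-in stems (assumption: simple sinusoids, not real recordings)
vocals = 0.5 * np.sin(2 * np.pi * 440 * t)  # "vocals" stem
bass = 0.3 * np.sin(2 * np.pi * 80 * t)     # "bass" stem
mix = vocals + bass                         # the observed mixture

# Oracle ideal ratio mask: weight each frequency bin by the fraction
# of its energy that belongs to the target stem, then apply the mask
# to the mixture spectrum and invert back to the time domain.
V, B, M = np.fft.rfft(vocals), np.fft.rfft(bass), np.fft.rfft(mix)
mask = np.abs(V) ** 2 / (np.abs(V) ** 2 + np.abs(B) ** 2 + 1e-12)
vocals_est = np.fft.irfft(mask * M, n=len(mix))

# Relative reconstruction error of the separated "vocals" stem
err = np.mean((vocals_est - vocals) ** 2) / np.mean(vocals ** 2)
print(err < 1e-3)
```

Because the two stand-in stems occupy disjoint frequency bins, the oracle mask recovers the target almost exactly; real music stems overlap heavily in time-frequency, which is what makes the task hard.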
Libraries
Use these libraries to find Music Source Separation models and implementations.

Latest papers
A fully differentiable model for unsupervised singing voice separation
A novel model was recently proposed by Schulze-Forster et al. in [1] for unsupervised music source separation.
Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models
Our results indicate three key findings: (1) using generative compression, it is feasible to leverage highly compressed data while incurring a negligible impact on machine perceptual quality; (2) machine perceptual quality correlates strongly with deep similarity metrics, indicating a crucial role of these metrics in the development of machine-oriented codecs; and (3) using lossy compressed datasets (e.g., ImageNet) for pre-training can lead to counter-intuitive scenarios where lossy compression increases machine perceptual quality rather than degrading it.
Pre-training Music Classification Models via Music Source Separation
In this paper, we study whether music source separation can be used as a pre-training strategy for music representation learning, targeted at music classification tasks.
Music Source Separation Based on a Lightweight Deep Learning Framework (DTTNET: DUAL-PATH TFC-TDF UNET)
Music source separation (MSS) aims to extract 'vocals', 'drums', 'bass', and 'other' tracks from a piece of mixed music.
The Sound Demixing Challenge 2023 – Music Demixing Track
We propose a formalization of the errors that can occur in the design of a training dataset for MSS systems and introduce two new datasets that simulate such errors: SDXDB23_LabelNoise and SDXDB23_Bleeding.
Sound Demixing Challenge 2023 Music Demixing Track Technical Report: TFC-TDF-UNet v3
In this report, we present our award-winning solutions for the Music Demixing Track of Sound Demixing Challenge 2023.
Quantifying Spatial Audio Quality Impairment
Spatial audio quality is a highly multifaceted concept, with many interactions between environmental, geometrical, anatomical, psychological, and contextual considerations.
The Whole Is Greater than the Sum of Its Parts: Improving DNN-based Music Source Separation
We modify the target network, i.e., the network architecture of the original DNN-based MSS, by adding bridging paths for each output instrument to share their information.
Hybrid Transformers for Music Source Separation
While it performs poorly when trained only on MUSDB, we show that it outperforms Hybrid Demucs (trained on the same data) by 0.45 dB of SDR when using 800 extra training songs.
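SDR gaps like the 0.45 dB reported above are computed from reference and estimated stems. A minimal sketch of the plain (non-scale-invariant) SDR definition, on hypothetical random signals — note that evaluation campaigns typically use specific SDR variants (e.g., the BSS Eval family), so this is illustrative only:

```python
import numpy as np

def sdr(reference, estimate, eps=1e-12):
    # Signal-to-Distortion Ratio in dB: ratio of reference energy
    # to the energy of the residual (reference minus estimate).
    num = np.sum(reference ** 2)
    den = np.sum((reference - estimate) ** 2) + eps
    return 10 * np.log10(num / den + eps)

# Hypothetical stems: an estimate with ~1% residual noise energy
# relative to the reference sits around 20 dB SDR.
rng = np.random.default_rng(1)
ref = rng.standard_normal(8000)
est = ref + 0.1 * rng.standard_normal(8000)
print(sdr(ref, est))
```

Higher is better; a 0.45 dB improvement means the estimate's residual error energy shrank by roughly 10 %.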
MedleyVox: An Evaluation Dataset for Multiple Singing Voices Separation
Second, to overcome the absence of existing multi-singing datasets for training, we present a strategy for constructing multiple-singing mixtures from various single-singing datasets.