Music Source Separation
53 papers with code • 3 benchmarks • 7 datasets
Music source separation is the task of decomposing music into its constituent components, e.g., yielding separated stems for the vocals, bass, and drums.
(Image credit: SigSep)
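Many separation systems estimate a time-frequency mask per stem and multiply it with the mixture spectrogram. A minimal sketch of this masking idea, using NumPy and toy magnitude spectrograms (the "ideal ratio mask" here is an illustrative baseline, not any specific paper's method):

```python
import numpy as np

def ideal_ratio_mask(source_mags, eps=1e-8):
    """Soft masks: each source's share of the total magnitude in every bin."""
    total = np.sum(source_mags, axis=0) + eps
    return source_mags / total  # shape: (n_sources, freq, time)

def apply_masks(mix_stft, masks):
    """Estimate each source's spectrogram by masking the mixture."""
    return masks * mix_stft  # broadcasts over the sources axis

# Toy data: 2 sources, 4 frequency bins, 3 frames.
rng = np.random.default_rng(0)
source_mags = rng.random((2, 4, 3))
mix_mag = source_mags.sum(axis=0)

masks = ideal_ratio_mask(source_mags)
estimates = apply_masks(mix_mag, masks)
# The masks of all sources sum to (almost) 1 in every bin,
# so the estimates add back up to the mixture.
print(np.allclose(masks.sum(axis=0), 1.0, atol=1e-6))
```

In practice the masks are predicted by a neural network from the mixture alone, since the true source spectrograms are unavailable at inference time.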
Libraries
Use these libraries to find Music Source Separation models and implementations.
Latest papers
An Efficient Short-Time Discrete Cosine Transform and Attentive MultiResUNet Framework for Music Source Separation
The proposed network is applied to source separation for the first time; it is more computationally efficient than state-of-the-art separation networks and achieves comparable performance at a fraction of the computational cost.
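The short-time discrete cosine transform (STDCT) is a real-valued time-frequency transform, which avoids the complex phase that an STFT-based system has to model. A minimal sketch with SciPy, using illustrative frame sizes rather than the paper's configuration:

```python
import numpy as np
from scipy.fft import dct, idct

def stdct(signal, frame_len=8, hop=4):
    """Short-time DCT: slice the signal into overlapping frames and
    take a type-II DCT of each one. All coefficients are real."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])
    return dct(frames, type=2, norm='ortho', axis=1)  # (frames, coeffs)

x = np.sin(2 * np.pi * np.arange(32) / 8.0)
coeffs = stdct(x)
print(coeffs.dtype)  # real-valued: no phase to model
# Each frame is exactly invertible with the matching inverse DCT.
print(np.allclose(idct(coeffs[0], type=2, norm='ortho'), x[:8]))
```

A separation network can then predict masks directly on these real coefficients, which is one reason DCT-based front ends can be cheaper than complex STFT processing.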
Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects
We propose an end-to-end music mixing style transfer system that converts the mixing style of an input multitrack to that of a reference song.
Music Source Separation with Band-split RNN
The performance of music source separation (MSS) models has been greatly improved in recent years thanks to the development of novel neural network architectures and training pipelines.
Music Source Separation with Generative Flow
Fully-supervised models for source separation are trained on parallel mixture-source data and are currently state-of-the-art.
Low Latency Time Domain Multichannel Speech and Music Source Separation
The goal is to obtain simple multichannel source separation with very low latency.
VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices
Finally, we use the frozen visual features learned by our lip synchronisation model in the singing voice separation task to outperform a baseline audio-visual model which was trained end-to-end.
Unsupervised Music Source Separation Using Differentiable Parametric Source Models
Integrating domain knowledge in the form of source models into a data-driven method leads to high data efficiency: the proposed approach achieves good separation quality even when trained on less than three minutes of audio.
CWS-PResUNet: Music Source Separation with Channel-wise Subband Phase-aware ResUNet
On the MUSDB18HQ test set, the proposed 276-layer CWS-PResUNet achieves state-of-the-art (SoTA) performance on vocals with an 8.92 signal-to-distortion ratio (SDR) score.
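SDR, the standard metric on MUSDB18HQ, compares the energy of the reference stem to the energy of the estimation error. A simplified sketch of the basic ratio (benchmark toolkits such as museval add alignment and scaling steps on top of this):

```python
import numpy as np

def sdr(reference, estimate, eps=1e-12):
    """Signal-to-distortion ratio in dB: reference energy over
    error energy. Higher is better; eps guards against log(0)."""
    error = reference - estimate
    return 10 * np.log10((np.sum(reference ** 2) + eps)
                         / (np.sum(error ** 2) + eps))

ref = np.sin(np.linspace(0, 4 * np.pi, 1000))
noisy = ref + 0.01 * np.random.default_rng(0).standard_normal(1000)
print(sdr(ref, noisy))  # a near-perfect estimate gives a high SDR
```

An SDR of 8.92 dB means the separated vocal signal carries roughly eight times more reference energy than residual error energy.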
Danna-Sep: Unite to separate them all
Deep learning-based music source separation has gained a lot of interest over the past decade.
Transfer Learning with Jukebox for Music Source Separation
In this work, we demonstrate how a publicly available, pre-trained Jukebox model can be adapted for the problem of audio source separation from a single mixed audio channel.