Music Source Separation
53 papers with code • 3 benchmarks • 7 datasets
Music source separation is the task of decomposing music into its constituent components, e.g., yielding separated stems for the vocals, bass, and drums.
(Image credit: SigSep)
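A common spectrogram-domain formulation of this task is soft masking: each stem gets a time-frequency mask, the masks are applied to the mixture's STFT, and the masked spectrograms are inverted back to audio. The sketch below (function names and the Wiener-style mask construction are illustrative, not any specific paper's code) assumes per-stem magnitude estimates are already available, e.g. from a neural network:

```python
import numpy as np

def wiener_masks(stem_mags):
    """Soft (Wiener-style) masks from per-stem magnitude estimates.

    stem_mags: array of shape (n_stems, freq, time) holding non-negative
    magnitude estimates for each stem (e.g. vocals, bass, drums).
    Returns masks of the same shape that sum to 1 at every bin.
    """
    stem_mags = np.asarray(stem_mags, dtype=float)
    total = stem_mags.sum(axis=0, keepdims=True) + 1e-8  # avoid div by zero
    return stem_mags / total

def separate(mixture_stft, stem_mags):
    """Apply the masks to the complex mixture STFT to get per-stem STFTs.

    Because the masks sum to 1, the separated stems sum back to the mixture.
    """
    return wiener_masks(stem_mags) * mixture_stft[None, ...]
```

By construction the estimated stems are conservative: adding them reproduces the mixture exactly, which is one reason mask-based methods remain a strong baseline.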
Libraries
Use these libraries to find Music Source Separation models and implementations.
Most implemented papers
Open-Unmix - A Reference Implementation for Music Source Separation
Music source separation is the task of decomposing music into its constituent components, e.g., yielding separated stems for the vocals, bass, and drums.
Sams-Net: A Sliced Attention-based Neural Network for Music Source Separation
Convolutional neural network (CNN) or long short-term memory (LSTM) based models that take spectrograms or waveforms as input are commonly used for deep-learning-based audio source separation.
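The spectrogram input these models consume is typically a magnitude STFT: the waveform is cut into overlapping windowed frames and each frame is Fourier-transformed. A minimal NumPy sketch (parameter values are illustrative defaults, not tied to any particular model):

```python
import numpy as np

def stft_magnitude(x, n_fft=512, hop=128):
    """Magnitude spectrogram of a mono waveform x.

    Frames the signal with a Hann window and takes the real FFT of
    each frame. Returns an array of shape (n_frames, n_fft // 2 + 1),
    the typical input to spectrogram-based separation models.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))
```

Waveform-domain models skip this step entirely and learn their own analysis filters from the raw samples, which avoids the phase-reconstruction problem that spectrogram methods face.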
Music Source Separation in the Waveform Domain
Source separation for music is the task of isolating contributions, or stems, from different instruments recorded individually and arranged together to form a song.
Time-Domain Audio Source Separation Based on Wave-U-Net Combined with Discrete Wavelet Transform
Motivated by the fact that the DWT has an anti-aliasing filter and the perfect-reconstruction property, we design the proposed layers around it.
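The perfect-reconstruction property mentioned here can be demonstrated with the simplest wavelet, the Haar DWT: one analysis step splits the signal into low-pass and high-pass halves, and the synthesis step recovers the input exactly. A minimal sketch (using Haar for illustration; the paper's layers are built on the DWT in general, not necessarily Haar):

```python
import numpy as np

def haar_analysis(x):
    """One level of the Haar DWT: split x (even length) into a
    low-pass (approximation) and a high-pass (detail) half."""
    x = np.asarray(x, dtype=float)
    lo = (x[0::2] + x[1::2]) / np.sqrt(2)
    hi = (x[0::2] - x[1::2]) / np.sqrt(2)
    return lo, hi

def haar_synthesis(lo, hi):
    """Inverse step: recombine the halves, reconstructing the
    original signal exactly (perfect reconstruction)."""
    out = np.empty(2 * len(lo))
    out[0::2] = (lo + hi) / np.sqrt(2)
    out[1::2] = (lo - hi) / np.sqrt(2)
    return out
```

Unlike plain strided downsampling, this down/up pair loses no information, which is the property the proposed layers exploit to avoid aliasing artifacts.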
Meta-learning Extractors for Music Source Separation
We propose a hierarchical meta-learning-inspired model for music source separation (Meta-TasNet) in which a generator model is used to predict the weights of individual extractor models.
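The core idea — a generator network predicting the weights of per-source extractor networks — is a hypernetwork. A toy NumPy sketch of that mechanism (all names, shapes, and the fixed random projection are illustrative stand-ins, not the Meta-TasNet architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM, IN_DIM, OUT_DIM = 4, 8, 8

# A fixed random projection stands in for the learned generator network.
GEN_PROJ = rng.standard_normal((EMB_DIM, IN_DIM * OUT_DIM)) * 0.1

def generate_extractor_weights(source_embedding):
    """Map a per-source embedding (e.g. one vector for 'vocals') to the
    weight matrix of that source's extractor layer."""
    return (source_embedding @ GEN_PROJ).reshape(OUT_DIM, IN_DIM)

def extract(features, weights):
    """Run an extractor layer whose weights were predicted above."""
    return np.tanh(weights @ features)
```

The appeal is parameter sharing: one generator serves all sources, so extractors for different instruments differ only through their conditioning embedding.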
Unsupervised Interpretable Representation Learning for Singing Voice Separation
In this work, we present a method for learning interpretable music signal representations directly from waveform signals.
Solos: A Dataset for Audio-Visual Music Analysis
In this paper, we present a new dataset of music performance videos that can be used to train machine learning methods for multiple tasks, such as audio-visual blind source separation and localization, cross-modal correspondence, cross-modal generation and, in general, any audio-visual self-supervised task.
Mixing-Specific Data Augmentation Techniques for Improved Blind Violin/Piano Source Separation
Blind music source separation has been a popular and active subject of research in both the music information retrieval and signal processing communities.
D3Net: Densely connected multidilated DenseNet for music source separation
In this paper, we argue for the importance of rapidly growing the receptive field while simultaneously modeling multi-resolution data in a single convolution layer, and propose a novel CNN architecture called densely connected multidilated DenseNet (D3Net).
LaSAFT: Latent Source Attentive Frequency Transformation for Conditioned Source Separation
Recent deep-learning approaches have shown that Frequency Transformation (FT) blocks can significantly improve spectrogram-based single-source separation models by capturing frequency patterns.
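An FT block's key ingredient is a fully connected layer applied along the frequency axis, so every output frequency bin can draw on every input bin — which lets the model capture patterns like harmonic series that span the whole spectrum. A minimal stand-in (the function name and the plain linear-plus-ReLU form are illustrative, not LaSAFT's exact block):

```python
import numpy as np

def ft_block(spec, weight, bias):
    """Dense transform along the frequency axis of a (freq, time)
    spectrogram, applied independently at each time frame.

    weight: (freq_out, freq_in); bias: (freq_out, 1).
    """
    return np.maximum(weight @ spec + bias, 0.0)  # linear map + ReLU
```

Contrast this with a convolution, whose small kernel only mixes neighboring bins; the dense frequency transform is what gives FT blocks their global view of the spectrum.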