Audio Source Separation
44 papers with code • 2 benchmarks • 14 datasets
Audio Source Separation is the process of separating a mixture (e.g. a pop band recording) into isolated sounds from individual sources (e.g. just the lead vocals).
Source: Model selection for deep audio source separation via clustering analysis
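As a rough illustration of the definition above, the sketch below mixes two synthetic sources into a two-channel recording and then recovers them. It rests on strong assumptions that are not from the source: the mixing is instantaneous and linear, and the 2x2 mixing matrix `A` is known. In practice (and in blind separation in particular) the mixing is unknown and must be estimated. The names `vocals`, `bass`, `mix`, and `invert_2x2` are hypothetical and chosen only for this example.

```python
import math

def mix(sources, mixing_matrix):
    """Instantaneous linear mixing: x[i][t] = sum_j A[i][j] * s[j][t]."""
    n_samples = len(sources[0])
    return [
        [sum(a * s[t] for a, s in zip(row, sources)) for t in range(n_samples)]
        for row in mixing_matrix
    ]

def invert_2x2(m):
    """Closed-form inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("mixing matrix is singular; sources cannot be recovered")
    return [[d / det, -b / det], [-c / det, a / det]]

# Two toy "sources": sine waves standing in for a vocal line and a bass line.
sample_rate = 8000
n = 400
vocals = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(n)]
bass = [0.5 * math.sin(2 * math.pi * 55 * t / sample_rate) for t in range(n)]

# Assumed (known) mixing matrix: each row is one microphone/channel.
A = [[0.8, 0.3],
     [0.2, 0.9]]

mixture = mix([vocals, bass], A)            # what we would actually observe
estimates = mix(mixture, invert_2x2(A))     # apply A^-1 to undo the mixing

# Largest sample-wise deviation between estimates and true sources.
max_error = max(
    abs(e - s)
    for est, src in zip(estimates, [vocals, bass])
    for e, s in zip(est, src)
)
```

With a known, invertible mixing matrix the recovery is exact up to floating-point error. The methods listed on this page address the harder settings where the mixing is unknown, convolutive, or single-channel.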
Most implemented papers
Directional Sparse Filtering using Weighted Lehmer Mean for Blind Separation of Unbalanced Speech Mixtures
In blind source separation of speech signals, the inherent imbalance in the source spectrum poses a challenge for methods that rely on single-source dominance for the estimation of the mixing matrix.
Unsupervised Music Source Separation Using Differentiable Parametric Source Models
Integrating domain knowledge in the form of source models into a data-driven method leads to high data efficiency: the proposed approach achieves good separation quality even when trained on less than three minutes of audio.
Generalization Challenges for Neural Architectures in Audio Source Separation
Recent work has shown that recurrent neural networks can be trained to separate individual speakers in a sound mixture with high fidelity.
Audio-Visual Scene Analysis with Self-Supervised Multisensory Features
The thud of a bouncing ball, the onset of speech as lips open -- when visual and audio events occur together, it suggests that there might be a common, underlying event that produced both signals.
Sparse Gaussian Process Audio Source Separation Using Spectrum Priors in the Time-Domain
As a result, source separation GP models have been restricted to the analysis of short audio frames.
Audio Source Separation Using Variational Autoencoders and Weak Class Supervision
In this paper, we propose a source separation method that is trained by observing the mixtures and the class labels of the sources present in the mixture without any access to isolated sources.
Training Generative Adversarial Networks from Incomplete Observations using Factorised Discriminators
We apply our method to image generation, image segmentation and audio source separation, and obtain improved performance over a standard GAN when additional incomplete training examples are available.
A Provably Correct and Robust Algorithm for Convolutive Nonnegative Matrix Factorization
We present an algorithm that takes advantage of the NMF model underlying CNMF and exploits existing algorithms for separable NMF to provably find a solution under certain conditions.
Sams-Net: A Sliced Attention-based Neural Network for Music Source Separation
Convolutional neural network (CNN) or long short-term memory (LSTM) based models that take spectrograms or waveforms as input are commonly used for deep learning based audio source separation.
Retrieving Signals in the Frequency Domain with Deep Complex Extractors
Using the Wall Street Journal Dataset, we compare our phase-aware loss to several others that operate in both the time and frequency domains and demonstrate the effectiveness of our proposed signal extraction method and proposed loss.