Audio Source Separation
44 papers with code • 2 benchmarks • 14 datasets
Audio Source Separation is the process of separating a mixture (e.g. a pop band recording) into isolated sounds from individual sources (e.g. just the lead vocals).
Source: Model selection for deep audio source separation via clustering analysis
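A minimal sketch of the separation idea using time-frequency masking (the synthetic tones, oracle masks, and STFT settings here are illustrative assumptions, not taken from any of the papers below):

```python
import numpy as np
from scipy.signal import stft, istft

fs = 8000
t = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 440 * t)    # stand-in for "lead vocals": 440 Hz tone
s2 = np.sin(2 * np.pi * 1760 * t)   # stand-in for "accompaniment": 1760 Hz tone
mix = s1 + s2                       # the observed mixture

# STFT of the mixture and of the (oracle) sources
_, _, M = stft(mix, fs, nperseg=256)
_, _, S1 = stft(s1, fs, nperseg=256)
_, _, S2 = stft(s2, fs, nperseg=256)

# Ideal ratio mask favoring source 1; a learned model would predict this mask
mask = np.abs(S1) / (np.abs(S1) + np.abs(S2) + 1e-8)
_, est = istft(mask * M, fs, nperseg=256)
est = est[:len(mix)]  # trim STFT padding back to the input length
```

In practice the oracle mask is unavailable; supervised systems train a network to estimate it from the mixture alone.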
Most implemented papers
J-Net: Randomly weighted U-Net for audio source separation
Motivated by these findings, we pose two questions: what is the value of randomly weighted networks in difficult generative audio tasks such as audio source separation, and does this positive correlation still hold for large random networks and their trained counterparts?
Time-Domain Audio Source Separation Based on Wave-U-Net Combined with Discrete Wavelet Transform
Building on this idea, and exploiting the fact that the DWT has an anti-aliasing filter and the perfect-reconstruction property, we design the proposed layers.
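The perfect-reconstruction property mentioned above can be shown with a one-level Haar DWT, the simplest wavelet transform (this is a generic illustration, not the specific layers proposed in the paper):

```python
import numpy as np

def haar_dwt(x):
    # One level of the Haar DWT: approximation (low-pass) and
    # detail (high-pass) coefficients at half the input rate.
    x = x.reshape(-1, 2)
    a = (x[:, 0] + x[:, 1]) / np.sqrt(2)
    d = (x[:, 0] - x[:, 1]) / np.sqrt(2)
    return a, d

def haar_idwt(a, d):
    # Inverse transform: interleave the reconstructed sample pairs.
    out = np.empty(2 * len(a))
    out[0::2] = (a + d) / np.sqrt(2)
    out[1::2] = (a - d) / np.sqrt(2)
    return out

x = np.random.randn(1024)
a, d = haar_dwt(x)
x_rec = haar_idwt(a, d)  # recovers x exactly (up to float rounding)
```

Because down-sampling happens only after the analysis filters, no information is lost, which is what makes DWT layers attractive as aliasing-free substitutes for plain strided convolutions.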
Unsupervised Audio Source Separation using Generative Priors
State-of-the-art under-determined audio source separation systems rely on supervised end-to-end training of carefully tailored neural network architectures operating either in the time or the spectral domain.
Solos: A Dataset for Audio-Visual Music Analysis
In this paper, we present a new dataset of music performance videos which can be used for training machine learning methods for multiple tasks such as audio-visual blind source separation and localization, cross-modal correspondences, cross-modal generation and, in general, any audio-visual self-supervised task.
OtoWorld: Towards Learning to Separate by Learning to Move
The agent receives a reward for turning off a source.
AutoClip: Adaptive Gradient Clipping for Source Separation Networks
Clipping the gradient is a known approach to improving gradient descent, but it requires hand-selecting a clipping-threshold hyperparameter.
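The adaptive idea can be sketched as clipping to a running percentile of the gradient-norm history instead of a fixed threshold (a minimal numpy sketch of percentile-based clipping; the class name and percentile default are illustrative, not the paper's reference implementation):

```python
import numpy as np

class AutoClip:
    """Adaptive gradient clipping: the threshold is the p-th percentile
    of all gradient norms observed so far, so no hand tuning is needed."""

    def __init__(self, percentile=10.0):
        self.percentile = percentile
        self.history = []

    def clip(self, grads):
        # Global L2 norm over all parameter gradients.
        norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
        self.history.append(norm)
        threshold = np.percentile(self.history, self.percentile)
        # Rescale only if the current norm exceeds the adaptive threshold.
        scale = min(1.0, threshold / (norm + 1e-12))
        return [g * scale for g in grads]

clipper = AutoClip(percentile=10.0)
g1 = clipper.clip([np.array([3.0, 4.0])])    # norm 5: first step, unclipped
g2 = clipper.clip([np.array([30.0, 40.0])])  # norm 50: clipped adaptively
```

In a training loop, `clip` would be called on the parameter gradients between the backward pass and the optimizer step.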
The Cone of Silence: Speech Separation by Localization
Given a multi-microphone recording of an unknown number of speakers talking concurrently, we simultaneously localize the sources and separate the individual speakers.
Unified Gradient Reweighting for Model Biasing with Applications to Source Separation
In this paper, we propose a simple, unified gradient reweighting scheme, with a lightweight modification to bias the learning process of a model and steer it towards a certain distribution of results.
Densely connected multidilated convolutional networks for dense prediction tasks
In this paper, we argue for the importance of dense, simultaneous modeling of multiresolution representations and propose a novel CNN architecture called the densely connected multidilated DenseNet (D3Net).
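The core ingredient, a layer that applies several dilation factors at once so a single layer sees multiple resolutions, can be sketched as follows (a simplified 1-D numpy illustration; D3Net itself assigns dilations per channel group inside 2-D DenseNet blocks):

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    # 'Same'-padded 1-D convolution where kernel taps are spaced
    # `dilation` samples apart, enlarging the receptive field.
    k = len(w)
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, pad)
    return np.array([
        sum(w[j] * xp[i + j * dilation] for j in range(k))
        for i in range(len(x))
    ])

def multidilated_conv1d(x, weights, dilations):
    # One "multidilated" layer: each filter uses its own dilation factor
    # and the branch outputs are summed, so a single layer simultaneously
    # models several resolutions of the input.
    return sum(dilated_conv1d(x, w, d) for w, d in zip(weights, dilations))

x = np.arange(16, dtype=float)
identity = np.array([0.0, 1.0, 0.0])  # center-tap kernel passes x through
y = multidilated_conv1d(x, [identity, identity], [1, 2])
```

Mixing dilations within one layer, rather than growing them layer by layer, avoids the blind spots (aliasing) that arise when densely connected skip paths combine features computed at different dilation rates.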
Music source separation conditioned on 3D point clouds
This paper proposes a multi-modal deep learning model to perform music source separation conditioned on 3D point clouds of music performance recordings.