Audio Source Separation
44 papers with code • 2 benchmarks • 14 datasets
Audio Source Separation is the process of separating a mixture (e.g. a pop band recording) into isolated sounds from individual sources (e.g. just the lead vocals).
Source: Model selection for deep audio source separation via clustering analysis
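To make the definition above concrete, here is a minimal sketch of mask-based separation on a synthetic two-source mixture. It is an illustration only and not the method of any paper listed here: the function name `separate_with_ratio_masks`, the two sine "sources", and the frame length are all assumptions chosen for the example, and the masks are computed from the true sources (an oracle), which a real system would have to estimate from the mixture alone.

```python
import numpy as np


def separate_with_ratio_masks(mixture, sources, frame_len=256, eps=1e-8):
    """Split `mixture` with oracle ratio masks built from the true `sources`.

    Hypothetical helper for illustration; the signal length must be a
    multiple of `frame_len` because frames do not overlap.
    """
    def to_spec(x):
        # Non-overlapping rectangular frames, one FFT per frame.
        return np.fft.rfft(x.reshape(-1, frame_len), axis=1)

    mix_spec = to_spec(mixture)
    # Magnitude spectrogram of each source: (n_sources, n_frames, n_bins).
    mags = np.array([np.abs(to_spec(s)) for s in sources])
    # Ratio mask: each source's share of the total magnitude per bin.
    masks = mags / (mags.sum(axis=0) + eps)
    # Apply each mask to the mixture spectrogram and go back to time domain.
    estimates = [
        np.fft.irfft(m * mix_spec, n=frame_len, axis=1).reshape(-1)
        for m in masks
    ]
    return np.stack(estimates)


# Toy "band": two tones standing in for vocals and bass (assumed values).
sr = 8000
t = np.arange(256 * 32) / sr
vocals = np.sin(2 * np.pi * 1000 * t)
bass = 0.5 * np.sin(2 * np.pi * 250 * t)
mixture = vocals + bass

estimates = separate_with_ratio_masks(mixture, [vocals, bass])
```

Because the two tones fall on different frequency bins, the masks recover each source almost exactly. Real recordings overlap heavily in time and frequency, which is why the papers below learn the masks or the source signals from data instead.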
Latest papers
Unsupervised Source Separation via Bayesian Inference in the Latent Domain
State-of-the-art audio source separation models rely on supervised, data-driven approaches, which can be expensive in terms of labeling resources.
Multi-Task Audio Source Separation
In detail, the proposed model follows a two-stage pipeline, which separates the three types of audio signals and then performs signal compensation separately.
Densely Connected Multi-Dilated Convolutional Networks for Dense Prediction Tasks
In this paper, we argue that dense, simultaneous modeling of multiresolution representations is important, and we propose a novel CNN architecture called densely connected multidilated DenseNet (D3Net).
Parallel and Flexible Sampling from Autoregressive Models via Langevin Dynamics
This paper introduces an alternative approach to sampling from autoregressive models.
Sampling-Frequency-Independent Audio Source Separation Using Convolution Layer Based on Impulse Invariant Method
Audio source separation is often used as a preprocessing step in various applications, and one of its ultimate goals is to construct a single versatile model capable of handling a wide variety of audio signals.
Differentiable Model Compression via Pseudo Quantization Noise
DiffQ is differentiable both with respect to the unquantized weights and the number of bits used.
Compute and memory efficient universal sound source separation
Recent progress in audio source separation led by deep learning has enabled many neural network models to provide robust solutions to this fundamental estimation problem.
Music source separation conditioned on 3D point clouds
This paper proposes a multi-modal deep learning model to perform music source separation conditioned on 3D point clouds of music performance recordings.
Directional Sparse Filtering using Weighted Lehmer Mean for Blind Separation of Unbalanced Speech Mixtures
In blind source separation of speech signals, the inherent imbalance in the source spectrum poses a challenge for methods that rely on single-source dominance for the estimation of the mixing matrix.