Audio Source Separation

44 papers with code • 2 benchmarks • 14 datasets

Audio Source Separation is the process of separating a mixture (e.g. a pop band recording) into isolated sounds from individual sources (e.g. just the lead vocals).
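The core idea can be illustrated with a toy oracle-masking example: mix two known sources, compute a framewise spectrum of each, and assign each frequency bin of the mixture to whichever source dominates it. This is a minimal numpy-only sketch (ideal binary masking with non-overlapping FFT frames), not any particular paper's method; the tone frequencies and frame size are arbitrary choices for illustration.

```python
import numpy as np

fs, n_fft = 8000, 512
t = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 440 * t)    # stand-in for "lead vocals"
s2 = np.sin(2 * np.pi * 1760 * t)   # stand-in for "accompaniment"
mix = s1 + s2

def frames(x):
    # non-overlapping frames -> real FFT per frame (trivially invertible)
    n = len(x) // n_fft * n_fft
    return np.fft.rfft(x[:n].reshape(-1, n_fft), axis=1)

S1, S2, M = frames(s1), frames(s2), frames(mix)
mask = np.abs(S1) > np.abs(S2)      # oracle (ideal binary) mask
est = np.fft.irfft(M * mask, n=n_fft, axis=1).ravel()

ref = s1[:len(est)]
snr = 10 * np.log10(np.sum(ref**2) / np.sum((ref - est)**2))
print(f"separation SNR: {snr:.1f} dB")
```

Real systems replace the oracle mask with one predicted by a neural network from the mixture alone, and use overlapping windowed frames (or operate directly in the time domain).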

Source: Model selection for deep audio source separation via clustering analysis

Most implemented papers

J-Net: Randomly weighted U-Net for audio source separation

EdwinYam/J-Net 29 Nov 2019

Motivated by these findings, we pose two questions: what is the value of randomly weighted networks in difficult generative audio tasks such as audio source separation, and does such a positive correlation still hold between large random networks and their trained counterparts?

Time-Domain Audio Source Separation Based on Wave-U-Net Combined with Discrete Wavelet Transform

TomohikoNakamura/dwtls 28 Jan 2020

With this in mind, and focusing on the fact that the DWT has an anti-aliasing filter and the perfect reconstruction property, we design the proposed layers.
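The perfect reconstruction property the excerpt refers to can be shown with a single-level Haar DWT, the simplest wavelet case: the analysis step splits a signal into lowpass (approximation) and highpass (detail) halves, and the synthesis step recovers the input exactly. This is a generic illustration of the property, not the layer design from the paper.

```python
import numpy as np

def haar_dwt(x):
    # single-level Haar analysis: orthonormal lowpass/highpass pair
    a = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation coefficients
    d = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail coefficients
    return a, d

def haar_idwt(a, d):
    # synthesis: perfect reconstruction of the even/odd samples
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x
```

Because the transform is invertible, a network can downsample with the DWT in place of strided convolution or pooling without discarding information to aliasing.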

Unsupervised Audio Source Separation using Generative Priors

vivsivaraman/sourcesepganprior 28 May 2020

State-of-the-art under-determined audio source separation systems rely on supervised end-to-end training of carefully tailored neural network architectures operating either in the time or the spectral domain.

Solos: A Dataset for Audio-Visual Music Analysis

JuanFMontesinos/Solos 14 Jun 2020

In this paper, we present a new dataset of music performance videos which can be used for training machine learning methods for multiple tasks such as audio-visual blind source separation and localization, cross-modal correspondences, cross-modal generation and, in general, any audio-visual self-supervised task.

OtoWorld: Towards Learning to Separate by Learning to Move

pseeth/otoworld 12 Jul 2020

The agent receives a reward for turning off a source.

AutoClip: Adaptive Gradient Clipping for Source Separation Networks

pseeth/autoclip 25 Jul 2020

Clipping the gradient is a known approach to improving gradient descent, but requires hand selection of a clipping threshold hyperparameter.
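The adaptive alternative can be sketched as follows: instead of a hand-picked threshold, clip at a chosen percentile of the gradient-norm history observed so far. This is our reading of the AutoClip idea in a framework-agnostic numpy form; the class name and the 10th-percentile default are illustrative choices, not the repository's API.

```python
import numpy as np

class AutoClipper:
    """Clip gradients at a percentile of the observed norm history
    (a sketch of the adaptive-clipping idea, not the official API)."""

    def __init__(self, percentile=10.0):
        self.percentile = percentile
        self.history = []          # gradient norms seen so far

    def clip(self, grad):
        norm = np.linalg.norm(grad)
        self.history.append(norm)
        # threshold adapts as training progresses
        threshold = np.percentile(self.history, self.percentile)
        if norm > threshold and norm > 0:
            grad = grad * (threshold / norm)   # rescale to the threshold
        return grad
```

In a training loop the clipper would wrap each gradient before the optimizer step; outlier gradients get rescaled to a norm that is typical for the run, with no hand-tuned constant.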

The Cone of Silence: Speech Separation by Localization

vivjay30/Cone-of-Silence NeurIPS 2020

Given a multi-microphone recording of an unknown number of speakers talking concurrently, we simultaneously localize the sources and separate the individual speakers.

Unified Gradient Reweighting for Model Biasing with Applications to Source Separation

etzinis/biased_separation 25 Oct 2020

In this paper, we propose a simple, unified gradient reweighting scheme, with a lightweight modification to bias the learning process of a model and steer it towards a certain distribution of results.
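One generic way to bias optimization with gradient weights (a toy stand-in, not necessarily the paper's exact scheme) is to scale each objective's gradient by a user-chosen priority before the update; the optimum then shifts toward the high-priority objective. The targets and priorities below are arbitrary illustrative values.

```python
import numpy as np

# Two toy per-source objectives: L_i(w) = 0.5 * (w - t_i)^2
targets = np.array([0.0, 10.0])
priorities = np.array([0.9, 0.1])   # user-chosen bias weights (sum to 1)

w = 5.0
for _ in range(500):
    grads = w - targets                       # dL_i/dw for each objective
    w -= 0.1 * np.dot(priorities, grads)      # reweighted gradient step

# the fixed point is the priority-weighted target: sum(p_i * t_i) = 1.0
print(w)
```

Changing the priorities steers the solution along the trade-off between the two objectives without touching the model or the data.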

Densely connected multidilated convolutional networks for dense prediction tasks

sony/ai-research-code 21 Nov 2020

In this paper, we claim the importance of a dense simultaneous modeling of multiresolution representation and propose a novel CNN architecture called densely connected multidilated DenseNet (D3Net).

Music source separation conditioned on 3D point clouds

francesclluis/point-cloud-source-separation 3 Feb 2021

This paper proposes a multi-modal deep learning model to perform music source separation conditioned on 3D point clouds of music performance recordings.