Audio Source Separation

44 papers with code • 2 benchmarks • 14 datasets

Audio Source Separation is the process of separating a mixture (e.g. a pop band recording) into isolated sounds from individual sources (e.g. just the lead vocals).

Source: Model selection for deep audio source separation via clustering analysis
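The idea behind the task can be illustrated with a minimal, hypothetical sketch: mix two synthetic tones and recover each one by masking the mixture's spectrum. This is not any listed paper's method, just a toy frequency-domain separation under the assumption that the sources occupy disjoint frequency bands.

```python
import numpy as np

# Toy source separation by spectral masking (hypothetical sketch,
# not the method of any paper above).
sr = 16000                      # sample rate in Hz
t = np.arange(sr) / sr          # 1 second of audio

# Two synthetic "sources": a low tone and a high tone.
low = np.sin(2 * np.pi * 440 * t)     # stand-in for e.g. vocals
high = np.sin(2 * np.pi * 3000 * t)   # stand-in for accompaniment
mixture = low + high

# Separate in the frequency domain with a binary mask at 1 kHz.
spec = np.fft.rfft(mixture)
freqs = np.fft.rfftfreq(len(mixture), d=1 / sr)
mask = freqs < 1000.0

est_low = np.fft.irfft(spec * mask, n=len(mixture))
est_high = np.fft.irfft(spec * ~mask, n=len(mixture))

# Each estimate should correlate strongly with its source.
corr_low = np.corrcoef(est_low, low)[0, 1]
corr_high = np.corrcoef(est_high, high)[0, 1]
print(f"low: {corr_low:.3f}, high: {corr_high:.3f}")
```

Real systems replace the fixed binary mask with a learned, time-varying mask (typically predicted by a neural network over a spectrogram), since real sources overlap in frequency.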

Latest papers with no code

Sampling Frequency Independent Dialogue Separation

no code yet • 5 Jun 2022

The models are trained with audio sampled at 8 kHz.

SepIt: Approaching a Single Channel Speech Separation Bound

no code yet • 24 May 2022

We present an upper bound for the Single Channel Speech Separation task, which is based on an assumption regarding the nature of short segments of speech.

On loss functions and evaluation metrics for music source separation

no code yet • 16 Feb 2022

We investigate which loss functions yield better separations by benchmarking an extensive set of loss functions for music source separation.

Differentiable Digital Signal Processing Mixture Model for Synthesis Parameter Extraction from Mixture of Harmonic Sounds

no code yet • 1 Feb 2022

A differentiable digital signal processing (DDSP) autoencoder is a musical sound synthesizer that combines a deep neural network (DNN) and spectral modeling synthesis.

Fish sounds: towards the evaluation of marine acoustic biodiversity through data-driven audio source separation

no code yet • 13 Jan 2022

Moreover, one of the causes of biodiversity loss is sound pollution; in data obtained from regions with loud anthropogenic noise, it is hard to manually separate the artificial noise from the fish sounds.

Self-Supervised Beat Tracking in Musical Signals with Polyphonic Contrastive Learning

no code yet • 5 Jan 2022

In order to combat this problem, we present a new self-supervised learning pretext task for beat tracking and downbeat estimation.

Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data

no code yet • AAAI 2021

Our approach uses a single model for source separation of multiple sound types, and relies solely on weakly-labeled data for training.

Reduction of Subjective Listening Effort for TV Broadcast Signals with Recurrent Neural Networks

no code yet • 2 Nov 2021

Listening to the audio of TV broadcast signals can be challenging for hearing-impaired as well as normal-hearing listeners, especially when background sounds are prominent or too loud compared to the speech signal.

Visual Scene Graphs for Audio Source Separation

no code yet • ICCV 2021

At its core, AVSGS uses a recursive neural network that emits mutually-orthogonal sub-graph embeddings of the visual graph using multi-head attention.

Move2Hear: Active Audio-Visual Source Separation

no code yet • ICCV 2021

We introduce the active audio-visual source separation problem, where an agent must move intelligently in order to better isolate the sounds coming from an object of interest in its environment.