Audio Source Separation

44 papers with code • 2 benchmarks • 14 datasets

Audio Source Separation is the process of separating a mixture (e.g. a pop band recording) into isolated sounds from individual sources (e.g. just the lead vocals).

Source: Model selection for deep audio source separation via clustering analysis
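To make the definition concrete, here is a minimal sketch of separation by masking. It is illustrative only and is not taken from the cited paper. It assumes the mixture is a plain sum of the sources and uses an "oracle" ideal ratio mask computed from the true sources, so it shows an idealized upper bound rather than a practical separator. The function name, the test tones, and the sample rate are all hypothetical choices made for this example.

```python
import numpy as np


def separate_with_ideal_ratio_mask(sources, eps=1e-8):
    """Mix the given sources, then recover them with an oracle ratio mask.

    sources: array of shape (n_sources, n_samples).
    Returns (mixture, estimates), where estimates has the same shape as sources.
    """
    # Assumption: the mixture is the sample-wise sum of the sources.
    mixture = sources.sum(axis=0)

    # Move to the frequency domain (one FFT over the whole signal, for brevity).
    source_spectra = np.fft.rfft(sources, axis=-1)
    mixture_spectrum = np.fft.rfft(mixture)

    # Ideal ratio mask: each source's share of the total magnitude per bin.
    magnitudes = np.abs(source_spectra)
    masks = magnitudes / (magnitudes.sum(axis=0, keepdims=True) + eps)

    # Apply each mask to the mixture and return to the time domain.
    estimates = np.fft.irfft(
        masks * mixture_spectrum, n=sources.shape[-1], axis=-1
    )
    return mixture, estimates


# Two hypothetical "sources": pure tones that do not overlap in frequency.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone_a = np.sin(2 * np.pi * 440 * t)
tone_b = 0.5 * np.sin(2 * np.pi * 1000 * t)

mixture, estimates = separate_with_ideal_ratio_mask(np.stack([tone_a, tone_b]))
print("max error on first source:", np.max(np.abs(estimates[0] - tone_a)))
```

A real system has no access to the true sources, so it has to predict the masks (or the waveforms directly) from the mixture alone. That prediction step is what the models listed on this page learn.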

Latest papers with no code

Gull: A Generative Multifunctional Audio Codec

no code yet • 7 Apr 2024

We introduce Gull, a generative multifunctional audio codec.

Mixture of Dynamical Variational Autoencoders for Multi-Source Trajectory Modeling and Separation

no code yet • 7 Dec 2023

In this paper, we propose a latent-variable generative model called mixture of dynamical variational autoencoders (MixDVAE) to model the dynamics of a system composed of multiple moving sources.

GASS: Generalizing Audio Source Separation with Large-scale Data

no code yet • 29 Sep 2023

Here, we study a single general audio source separation (GASS) model trained to separate speech, music, and sound events in a supervised fashion with a large-scale dataset.

Language-Guided Audio-Visual Source Separation via Trimodal Consistency

no code yet • CVPR 2023

We propose a self-supervised approach for learning to perform audio source separation in videos based on natural language queries, using only unlabeled video and audio pairs as training data.

Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation

no code yet • 25 Jan 2023

Applying a diffusion model vocoder, pretrained to model single-speaker voices, to the output of a deterministic separation model leads to state-of-the-art separation results.

Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks

no code yet • 14 Dec 2022

In this paper, we focus on the cocktail fork problem, which takes a three-pronged approach to source separation by separating an audio mixture such as a movie soundtrack or podcast into the three broad categories of speech, music, and sound effects (SFX, understood to include ambient noise and natural sound events).

Hyperbolic Audio Source Separation

no code yet • 9 Dec 2022

We introduce a framework for audio source separation using embeddings on a hyperbolic manifold that compactly represent the hierarchical relationship between sound sources and time-frequency features.

Differentiable Dictionary Search: Integrating Linear Mixing with Deep Non-Linear Modelling for Audio Source Separation

no code yet • 28 Nov 2022

This paper describes several improvements to a new method for signal decomposition that we recently formulated under the name of Differentiable Dictionary Search (DDS).

Learning Audio-Visual Dynamics Using Scene Graphs for Audio Source Separation

no code yet • 29 Oct 2022

In this paper, we propose to use the connection between audio and visual dynamics to solve two challenging tasks simultaneously: (i) separating audio sources from a mixture using visual cues, and (ii) predicting the 3D visual motion of a sounding source using its separated audio.

Hierarchic Temporal Convolutional Network With Cross-Domain Encoder for Music Source Separation

no code yet • IEEE Signal Processing Letters 2022

In this paper, we propose a model that combines complex spectrogram-domain features and time-domain features through a cross-domain encoder (CDE) and adopts a hierarchic temporal convolutional network (HTCN) to separate multiple music sources.