Speech Separation

97 papers with code • 18 benchmarks • 16 datasets

Speech separation is the task of extracting all overlapping speech sources from a given mixed speech signal. It is a special case of the source separation problem in which the focus is only on the overlapping speech sources; other interference, such as music or noise, is not the main concern of the study.

Source: A Unified Framework for Speech Separation
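
To make the problem statement concrete, the following is a minimal sketch of the usual formulation: the observed mixture is modeled as the sum of the individual sources, and a separator has to map the mixture back to one estimate per source. Everything in it is an illustrative assumption rather than something taken from the papers listed here: the two "speakers" are synthetic tones, the "separator" is an oracle frequency split that only works because the toy sources occupy disjoint bands, and quality is scored with scale-invariant SDR (SI-SDR), a metric commonly reported for this task.

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio in dB (higher is better)."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get the scaled target part.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    return 10 * np.log10((np.dot(target, target) + eps) / (np.dot(noise, noise) + eps))

# Toy stand-ins for two speakers (assumption: pure tones, not real speech).
sr = 8000                      # sample rate in Hz
t = np.arange(sr) / sr         # one second of samples
s1 = np.sin(2 * np.pi * 220 * t)
s2 = 0.5 * np.sin(2 * np.pi * 1000 * t)

# The single-channel mixture is the sum of the overlapping sources.
mixture = s1 + s2

# Oracle "separator": split the spectrum at 600 Hz. Real speech overlaps in
# frequency, so this trick does not transfer; learned models are needed there.
spec = np.fft.rfft(mixture)
freqs = np.fft.rfftfreq(len(mixture), d=1 / sr)
est1 = np.fft.irfft(np.where(freqs < 600, spec, 0), n=len(mixture))
est2 = np.fft.irfft(np.where(freqs >= 600, spec, 0), n=len(mixture))

print("SI-SDR of raw mixture vs. source 1 (dB):", si_sdr(mixture, s1))
print("SI-SDR of estimate 1 vs. source 1 (dB):", si_sdr(est1, s1))
print("SI-SDR of estimate 2 vs. source 2 (dB):", si_sdr(est2, s2))
```

The gap between the first printed value and the other two is the improvement attributable to separation in this toy setting.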

Image credit: Speech Separation of A Target Speaker Based on Deep Neural Networks

Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation

yuchen005/unified-enhance-separation 22 Feb 2023

To alleviate this problem, we propose a novel network to unify speech enhancement and separation with gradient modulation to improve noise-robustness.

31 stars

An Audio-Visual Speech Separation Model Inspired by Cortico-Thalamo-Cortical Circuits

jusperlee/lrs3-for-speech-separation 21 Dec 2022

Then, inspired by the large number of connections between cortical regions and the thalamus, the model fuses the auditory and visual information in a thalamic subnetwork through top-down connections.

73 stars

Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation

jwr1995/dtcn 27 Oct 2022

In this work, deformable convolution is proposed as a solution to allow TCN models to have dynamic receptive fields (RFs) that can adapt to various reverberation times for reverberant speech separation.

14 stars

CasNet: Investigating Channel Robustness for Speech Separation

sinica-slam/casnet 27 Oct 2022

In this study, inheriting the use of our previously constructed TAT-2mix corpus, we address the channel mismatch problem by proposing a channel-aware audio separation network (CasNet), a deep learning framework for end-to-end time-domain speech separation.

2 stars

OCD: Learning to Overfit with Conditional Diffusion Models

shaharlutatipersonal/ocd 2 Oct 2022

We present a dynamic model in which the weights are conditioned on an input sample x and are learned to match those that would be obtained by finetuning a base model on x and its label y.

16 stars

An efficient encoder-decoder architecture with top-down attention for speech separation

JusperLee/TDANet 30 Sep 2022

In addition, a large-size version of TDANet obtained SOTA results on three datasets, with MACs still only 10% of Sepformer and the CPU inference time only 24% of Sepformer.

205 stars

CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement

ruizhecao96/cmgan 22 Sep 2022

Convolution-augmented transformers (Conformers) have recently been proposed for various speech-domain applications, such as automatic speech recognition (ASR) and speech separation, as they can capture both local and global dependencies.

256 stars

Analysis of impact of emotions on target speech extraction and speech separation

butspeechfit/ravdess2mix 15 Aug 2022

One of the factors causing such degradation may be intrinsic speaker variability, such as emotions, occurring commonly in realistic speech.

2 stars

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding

espnet/espnet 19 Jul 2022

To showcase such integration, we performed experiments on carefully designed synthetic datasets for noisy-reverberant multi-channel speech translation (ST) and spoken language understanding (SLU) tasks, which can be used as benchmark corpora for future research.

7,907 stars

Resource-Efficient Separation Transformer

speechbrain/speechbrain 19 Jun 2022

Transformers have recently achieved state-of-the-art performance in speech separation.

7,911 stars