Speech Separation

25 papers with code · Speech

Greatest papers with code

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

20 Sep 2018 facebookresearch/demucs

The majority of the previous methods have formulated the separation problem through the time-frequency representation of the mixed signal, which has several drawbacks, including the decoupling of the phase and magnitude of the signal, the suboptimality of time-frequency representation for speech separation, and the long latency in calculating the spectrograms.

MUSIC SOURCE SEPARATION · SPEAKER SEPARATION · SPEECH SEPARATION

Deep learning for monaural speech separation

ICASSP 2014 posenhuang/deeplearningsourceseparation

We propose the joint optimization of the deep learning models (deep neural networks and recurrent neural networks) with an extra masking layer, which enforces a reconstruction constraint.

MULTI-SPEAKER SOURCE SEPARATION · SPEECH SEPARATION
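
The masking layer mentioned in the abstract above can be illustrated with a small sketch. It assumes a two-source setup in which the network emits two raw magnitude estimates per time-frequency bin, and the layer rescales them into soft masks applied to the mixture, so the two outputs always sum to the mixture (the reconstruction constraint). The function name `soft_mask_layer`, the flat-list representation, and the `eps` guard are illustrative choices, not taken from the paper's repository.

```python
def soft_mask_layer(est1, est2, mixture, eps=1e-12):
    """Turn two raw magnitude estimates into masked outputs.

    est1, est2: raw per-bin estimates for source 1 and source 2
                (hypothetical network outputs, flattened to lists).
    mixture:    per-bin magnitude of the mixed signal.
    eps:        small constant to avoid division by zero (an assumption).
    """
    out1, out2 = [], []
    for a, b, z in zip(est1, est2, mixture):
        total = abs(a) + abs(b) + eps
        # Soft masks abs(a)/total and abs(b)/total sum to ~1,
        # so out1 + out2 reproduces the mixture bin z.
        out1.append(abs(a) / total * z)
        out2.append(abs(b) / total * z)
    return out1, out2


if __name__ == "__main__":
    s1, s2 = soft_mask_layer([1.0, 3.0], [3.0, 1.0], [8.0, 4.0])
    print(s1, s2)
```

Because the masks are a deterministic function of the raw estimates, a training loss can be computed on the masked outputs and propagated back through the layer, which is one way to read "joint optimization" here.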

Filterbank design for end-to-end speech separation

23 Oct 2019 mpariente/asteroid

Also, we validate the use of parameterized filterbanks and show that complex-valued representations and masks are beneficial in all conditions.

SPEAKER RECOGNITION · SPEECH SEPARATION

Two-Step Sound Source Separation: Training on Learned Latent Targets

22 Oct 2019 mpariente/asteroid

In the first step we learn a transform (and its inverse) to a latent space where masking-based separation performance using oracles is optimal.

SPEECH SEPARATION

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation

14 Oct 2019 mpariente/asteroid

Recent studies in deep learning-based speech separation have proven the superiority of time-domain approaches to conventional time-frequency-based methods.

SPEECH SEPARATION

Real-time Single-channel Dereverberation and Separation with Time-domain Audio Separation Network

ISCA Interspeech 2018 mpariente/asteroid

We investigate the recently proposed Time-domain Audio Separation Network (TasNet) in the task of real-time single-channel speech dereverberation.

DENOISING · SPEECH SEPARATION

Alternative Objective Functions for Deep Clustering

ICASSP 2018 mpariente/asteroid

The recently proposed deep clustering framework represents a significant step towards solving the cocktail party problem.

SPEECH SEPARATION

TasNet: time-domain audio separation network for real-time, single-channel speech separation

1 Nov 2017 mpariente/asteroid

We directly model the signal in the time-domain using an encoder-decoder framework and perform the source separation on nonnegative encoder outputs.

SPEECH SEPARATION

Deep clustering: Discriminative embeddings for segmentation and separation

18 Aug 2015 mpariente/asteroid

The framework can be used without class labels, and therefore has the potential to be trained on a diverse set of sound types, and to generalize to novel sources.

SEMANTIC SEGMENTATION · SPEECH SEPARATION

Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks

18 Mar 2017 snsun/pit-speech-separation

We evaluated uPIT on the WSJ0 and Danish two- and three-talker mixed-speech separation tasks and found that uPIT outperforms techniques based on Non-negative Matrix Factorization (NMF) and Computational Auditory Scene Analysis (CASA), and compares favorably with Deep Clustering (DPCL) and the Deep Attractor Network (DANet).

SPEECH SEPARATION
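
The idea behind utterance-level permutation invariant training (uPIT) in the entry above can be sketched as follows: the loss is evaluated for every assignment of network outputs to reference speakers over the whole utterance, and the assignment with the lowest total error is used. This is a minimal sketch, assuming each source is a flat list of values and mean squared error as the per-source distance; the names `mse` and `upit_loss` are hypothetical and not from the linked repository.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def upit_loss(estimates, targets):
    """Return (loss, permutation) with the lowest utterance-level error.

    estimates: list of S estimated sources, each covering the full utterance.
    targets:   list of S reference sources, same lengths.
    The returned permutation maps estimate i to targets[perm[i]].
    """
    n = len(targets)
    best_loss, best_perm = None, None
    for perm in permutations(range(n)):
        # Error is accumulated over the entire utterance for this
        # assignment, so each output stays tied to a single speaker.
        loss = sum(mse(estimates[i], targets[p]) for i, p in enumerate(perm)) / n
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


if __name__ == "__main__":
    refs = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
    ests = [[0.0, 0.0, 0.1], [1.0, 0.9, 1.0]]
    print(upit_loss(ests, refs))
```

The exhaustive search costs S! evaluations, which is practical for the two- and three-talker tasks named in the abstract but grows quickly with more speakers.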