Speaker Separation
11 papers with code • 0 benchmarks • 3 datasets
Latest papers
Blind Speech Separation and Dereverberation using Neural Beamforming
In this paper, we present the Blind Speech Separation and Dereverberation (BSSD) network, which performs simultaneous speaker separation, dereverberation and speaker identification in a single neural network.
Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training
Speech separation is well developed thanks to the highly successful permutation invariant training (PIT) approach, but the frequent label-assignment switching that occurs during PIT training remains a problem when faster convergence and better performance are desired.
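The PIT idea referenced above can be sketched briefly: since the order of separated outputs is arbitrary, the loss is evaluated under every speaker-to-output assignment and the best permutation is kept. A minimal illustration (the function name and MSE criterion are illustrative choices, not the paper's exact formulation):

```python
import itertools
import numpy as np

def pit_mse_loss(estimates, targets):
    """Permutation invariant training (PIT) loss sketch.

    estimates, targets: arrays of shape (num_speakers, num_samples).
    Evaluates the MSE under every speaker-to-output assignment and
    returns (min_loss, best_permutation).
    """
    n = estimates.shape[0]
    best_loss, best_perm = np.inf, None
    for perm in itertools.permutations(range(n)):
        # mean squared error with targets reordered by this permutation
        loss = np.mean((estimates - targets[list(perm)]) ** 2)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

The "label assignment switching" problem the abstract mentions arises because the winning permutation can change from batch to batch early in training, so the network receives inconsistent targets.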
Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation
Although our system is trained on simulated room impulse responses (RIR) based on a fixed number of microphones arranged in a given geometry, it generalizes well to a real array with the same geometry.
Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss
We have open-sourced our re-implementation of DPRNN-TasNet (https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation); our TasTas is built on this implementation of DPRNN-TasNet, and we believe the results in this paper can be reproduced with ease.
Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation
Simultaneous grouping is first performed in each time frame by separating the spectra of different speakers with a permutation-invariantly trained neural network.
Neural separation of observed and unobserved distributions
In this work, we introduce a new method---Neural Egg Separation---to tackle the scenario of extracting a signal from an unobserved distribution additively mixed with a signal from an observed distribution.
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
In this paper, we present a novel system that separates the voice of a target speaker from multi-speaker signals, by making use of a reference signal from the target speaker.
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
The majority of the previous methods have formulated the separation problem through the time-frequency representation of the mixed signal, which has several drawbacks, including the decoupling of the phase and magnitude of the signal, the suboptimality of time-frequency representation for speech separation, and the long latency in calculating the spectrograms.
Monaural Audio Speaker Separation with Source Contrastive Estimation
Although the matrix determined by the output weights is dependent on a set of known speakers, we only use the input vectors during inference.
Deep attractor network for single-microphone speaker separation
We propose a novel deep learning framework for single channel speech separation by creating attractor points in high dimensional embedding space of the acoustic signals which pull together the time-frequency bins corresponding to each source.
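The attractor mechanism described above can be illustrated with a simplified sketch (not the paper's exact training procedure): each time-frequency bin has an embedding, an attractor per source is the centroid of the embeddings of the bins that source dominates, and soft masks come from each bin's similarity to the attractors.

```python
import numpy as np

def attractor_masks(embeddings, assignments):
    """Deep-attractor-style masking sketch (illustrative only).

    embeddings:  (num_bins, embed_dim) embedding per time-frequency bin
    assignments: (num_bins, num_sources) one-hot dominant-source labels
    Returns soft masks of shape (num_bins, num_sources).
    """
    # attractor for source s = centroid of embeddings assigned to s
    counts = assignments.sum(axis=0, keepdims=True)       # (1, S)
    attractors = (embeddings.T @ assignments) / counts    # (D, S)
    # softmax over each bin's similarity to the attractors
    scores = embeddings @ attractors                      # (B, S)
    scores -= scores.max(axis=1, keepdims=True)
    masks = np.exp(scores)
    return masks / masks.sum(axis=1, keepdims=True)
```

Bins whose embeddings lie near a source's attractor receive a mask close to 1 for that source, which is the "pulling together" of time-frequency bins the abstract describes.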