Speaker Separation

11 papers with code • 0 benchmarks • 3 datasets


Latest papers with no code

Mixture to Mixture: Leveraging Close-talk Mixtures as Weak-supervision for Speech Separation

no code yet • 14 Feb 2024

We propose mixture-to-mixture (M2M) training, a weakly-supervised neural speech separation algorithm that leverages close-talk mixtures as weak supervision for training discriminative models to separate far-field mixtures.

Spatial-Temporal Activity-Informed Diarization and Separation

no code yet • 30 Jan 2024

The global spatial activity functions are computed from the global spatial coherence functions, which are in turn derived from frequency-averaged local spatial activity functions.

Boosting Unknown-number Speaker Separation with Transformer Decoder-based Attractor

no code yet • 23 Jan 2024

We propose a novel speech separation model designed to separate mixtures with an unknown number of speakers.

Single-Microphone Speaker Separation and Voice Activity Detection in Noisy and Reverberant Environments

no code yet • 7 Jan 2024

Speech separation involves extracting an individual speaker's voice from a multi-speaker audio signal.

Binaural multichannel blind speaker separation with a causal low-latency and low-complexity approach

no code yet • 8 Dec 2023

In this paper, we introduce a causal, low-latency, and low-complexity approach for binaural multichannel blind speaker separation in noisy reverberant conditions.

Multi-channel Conversational Speaker Separation via Neural Diarization

no code yet • 15 Nov 2023

To enhance ASR performance in conversational or meeting environments, continuous speech separation (CSS) is commonly employed.

Scenario-Aware Audio-Visual TF-GridNet for Target Speech Extraction

no code yet • 30 Oct 2023

Target speech extraction aims to extract, based on a given conditioning cue, a target speech signal that is corrupted by interfering sources, such as noise or competing speakers.

UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures

no code yet • NeurIPS 2023

At each training step, we feed an input mixture to a deep neural network (DNN) to produce an intermediate estimate for each speaker, linearly filter the estimates, and optimize a loss so that, at each microphone, the filtered estimates of all the speakers can add up to the mixture to satisfy the above constraint.
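The constraint above can be illustrated with a minimal sketch: given per-speaker estimates and the observed mixture at one microphone, jointly fit one short FIR filter per speaker by least squares so the filtered estimates add up to the mixture, and take the mean squared residual as the loss. This is a simplified single-microphone illustration, not the paper's implementation; the function and parameter names are hypothetical.

```python
import numpy as np

def filtered_sum_loss(estimates, mixture, taps=4):
    """Mixture-consistency loss sketch (names hypothetical).

    estimates: (num_speakers, T) array of per-speaker estimates.
    mixture:   (T,) observed mixture at one microphone.
    Solves a joint least-squares problem for one `taps`-length FIR
    filter per speaker so the filtered estimates sum to the mixture,
    then returns the mean squared residual.
    """
    T = mixture.shape[0]
    # Build a stacked convolution (Toeplitz-like) matrix: one block of
    # delayed copies of each speaker's estimate, one column per filter tap.
    blocks = []
    for est in estimates:
        cols = [np.concatenate([np.zeros(d), est[:T - d]]) for d in range(taps)]
        blocks.append(np.stack(cols, axis=1))        # (T, taps)
    A = np.concatenate(blocks, axis=1)               # (T, num_speakers * taps)
    h, *_ = np.linalg.lstsq(A, mixture, rcond=None)  # joint filter estimate
    residual = mixture - A @ h                       # mixture-consistency error
    return np.mean(residual ** 2)
```

If the estimates capture the true sources, some filtering of them reconstructs the mixture almost exactly and the loss is near zero; unrelated estimates leave a large residual, which is the signal that drives the unsupervised training.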

Independent Vector Extraction Constrained on Manifold of Half-Length Filters

no code yet • 4 Apr 2023

In this paper, we propose a mixing model for joint blind source extraction where the mixing model parameters are linked across the frequencies.

Multi-Microphone Speaker Separation by Spatial Regions

no code yet • 13 Mar 2023

The network is trained to enforce a fixed mapping of regions to network outputs.