TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Speech Separation	WHAMR!	TD-Conformer (S)	SI-SDRi	10.5	# 13
Speech Separation	WHAMR!	TD-Conformer (M) + DM	SI-SDRi	12	# 11
Speech Separation	WHAMR!	TD-Conformer (L) + DM	SI-SDRi	13.4	# 5
Speech Separation	WHAMR!	TD-Conformer (XL) + DM	SI-SDRi	14.6	# 3
Speech Separation	WSJ0-2mix	TD-Conformer (XL) + DM	SI-SDRi	21.2	# 12

On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments

9 Oct 2023 · William Ravenscroft, Stefan Goetze, Thomas Hain

Speech separation remains an important topic for multi-speaker technology researchers. Convolution-augmented transformers (conformers) have performed well for many speech processing tasks but have been under-researched for speech separation. Most recent state-of-the-art (SOTA) separation models have been time-domain audio separation networks (TasNets). A number of successful models have made use of dual-path (DP) networks, which sequentially process local and global information. Time-domain conformers (TD-Conformers) are an analogue of the DP approach in that they also process local and global context sequentially but have a different time complexity function. It is shown that for realistic shorter signal lengths, conformers are more efficient when controlling for feature dimension. Subsampling layers are proposed to further improve computational efficiency. The best TD-Conformer achieves 14.6 dB and 21.2 dB SI-SDR improvement on the WHAMR! and WSJ0-2mix benchmarks, respectively.
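The results above are reported as SI-SDR improvement (SI-SDRi). As a minimal sketch of how that metric is commonly computed, the pure-Python example below uses the standard scale-invariant SDR definition and subtracts the score of the unprocessed mixture from the score of the separated estimate. The function names `si_sdr` and `si_sdr_improvement`, the `eps` stabiliser, and the toy sinusoid signals are illustrative assumptions; this is not the evaluation code from the paper or from the jwr1995/pubsep repository.

```python
import math


def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant SDR in dB for two equal-length sample sequences."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    # Remove the mean so a DC offset does not influence the score.
    est_mean = sum(estimate) / len(estimate)
    tgt_mean = sum(target) / len(target)
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]
    # Project the estimate onto the target to find the optimal scaling.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt)
    alpha = dot / (tgt_energy + eps)
    scaled = [alpha * t for t in tgt]
    # Everything not explained by the scaled target counts as distortion.
    residual_energy = sum((e - s) ** 2 for e, s in zip(est, scaled))
    scaled_energy = sum(s * s for s in scaled)
    return 10.0 * math.log10((scaled_energy + eps) / (residual_energy + eps))


def si_sdr_improvement(estimate, mixture, target):
    """SI-SDRi: gain of the estimate over the unprocessed mixture, in dB."""
    return si_sdr(estimate, target) - si_sdr(mixture, target)


# Toy example (hypothetical signals): two orthogonal sinusoids.
n = 1000
target = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
interferer = [math.sin(2 * math.pi * 13 * i / n) for i in range(n)]
mixture = [t + v for t, v in zip(target, interferer)]
# A pretend separator that attenuates the interferer amplitude by 10x.
estimate = [t + 0.1 * v for t, v in zip(target, interferer)]

gain = si_sdr_improvement(estimate, mixture, target)
print(f"SI-SDRi: {gain:.2f} dB")  # roughly 20 dB for this toy case
```

Because the target is rescaled before the residual is measured, multiplying the estimate by any non-zero constant leaves the score unchanged, which is why the metric is called scale-invariant.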

Code

jwr1995/pubsep official

Tasks

Computational Efficiency

Speech Separation

Datasets

WSJ0-2mix WHAM! WHAMR!

Results from the Paper

Ranked #3 on Speech Separation on WHAMR!

Methods

Convolution
