TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE (%)	GLOBAL RANK
Speaker Verification	VoxCeleb	WavLM+ECAPA-TDNN	EER	0.39	# 1
Speaker Recognition	VoxCeleb1	WavLM+ECAPA-TDNN	EER	0.39	# 1

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/espnet-spk-full-pipeline-speaker-embedding/speaker-verification-on-voxceleb)](https://paperswithcode.com/sota/speaker-verification-on-voxceleb?p=espnet-spk-full-pipeline-speaker-embedding)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/espnet-spk-full-pipeline-speaker-embedding/speaker-recognition-on-voxceleb1)](https://paperswithcode.com/sota/speaker-recognition-on-voxceleb1?p=espnet-spk-full-pipeline-speaker-embedding)`

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

30 Jan 2024 · Jee-weon Jung, Wangyou Zhang, Jiatong Shi, Zakaria Aldeneh, Takuya Higuchi, Barry-John Theobald, Ahmed Hussen Abdelaziz, Shinji Watanabe

This paper introduces ESPnet-SPK, a toolkit for training speaker embedding extractors, designed with several objectives. First, we provide an open-source platform on which researchers in the speaker recognition community can build models with little effort. We provide several models, ranging from x-vector to the recent SKA-TDNN, and the modular architecture makes it easy to develop variants. Second, we aim to bridge the developed models with other domains, so that the broader research community can readily incorporate state-of-the-art embedding extractors. Pre-trained embedding extractors are available off the shelf, and we demonstrate the toolkit's versatility by showcasing its integration with two tasks. A third goal is integration with diverse self-supervised learning features. We release a reproducible recipe that achieves an equal error rate of 0.39% on the Vox1-O evaluation protocol using WavLM-Large with ECAPA-TDNN.
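
For readers unfamiliar with the metric, the equal error rate (EER) is the operating point at which the false acceptance rate equals the false rejection rate over a list of verification trials. The following is a minimal sketch of that definition in Python with NumPy; it is not the toolkit's own scoring code, and the function name `equal_error_rate` and the toy trial list are assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(scores, labels):
    """Approximate the EER (as a fraction) from trial scores and 0/1 labels.

    scores: higher means "more likely the same speaker".
    labels: 1 for target (same-speaker) trials, 0 for non-target trials.
    Hypothetical helper for illustration; tied scores get no special handling.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Sort trials from highest to lowest score.
    order = np.argsort(-scores, kind="stable")
    labels = labels[order]

    n_target = labels.sum()
    n_nontarget = len(labels) - n_target

    # Sweep the threshold: at step k the top-k trials are accepted.
    false_accepts = np.cumsum(1 - labels)
    true_accepts = np.cumsum(labels)

    # Prepend the "accept nothing" operating point.
    far = np.concatenate(([0.0], false_accepts / n_nontarget))
    frr = np.concatenate(([1.0], 1.0 - true_accepts / n_target))

    # Take the threshold where the two error rates are closest.
    idx = np.argmin(np.abs(far - frr))
    return (far[idx] + frr[idx]) / 2.0


if __name__ == "__main__":
    # Toy trial list (made-up numbers, not results from the paper).
    toy_scores = [0.9, 0.6, 0.4, 0.1]
    toy_labels = [1, 0, 1, 0]
    print(f"EER = {100 * equal_error_rate(toy_scores, toy_labels):.2f}%")
```

In practice the scores would typically be similarities between pairs of speaker embeddings, and an interpolated ROC-based estimate is often preferred over this nearest-point approximation when the trial list is small.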
