Speaker Recognition
90 papers with code • 1 benchmark • 6 datasets
Speaker Recognition is the process of identifying or confirming the identity of a person given their speech segments.
Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition
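Modern systems typically map each utterance to a fixed-dimensional speaker embedding and compare embeddings to decide identity. A minimal sketch of this scoring step, assuming embeddings have already been extracted by some front-end model (the embedding dimension, threshold, and the random vectors standing in for real embeddings are all illustrative assumptions):

```python
import numpy as np

def cosine_score(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """Cosine similarity between two speaker embeddings."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return float(np.dot(a, b))

def verify(enroll_emb: np.ndarray, test_emb: np.ndarray, threshold: float = 0.5) -> bool:
    """Accept the claimed identity if similarity exceeds a tuned threshold."""
    return cosine_score(enroll_emb, test_emb) >= threshold

# Synthetic stand-ins for real embeddings, purely for illustration.
rng = np.random.default_rng(0)
speaker = rng.normal(size=256)                 # enrollment embedding
same = speaker + 0.1 * rng.normal(size=256)    # another utterance, same speaker
other = rng.normal(size=256)                   # a different speaker

print(verify(speaker, same))
print(verify(speaker, other))
```

In practice the threshold is tuned on a development set (e.g. to the equal error rate), and back-ends such as PLDA are often used instead of raw cosine scoring.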
Libraries
Use these libraries to find Speaker Recognition models and implementations.
Latest papers
3D-Speaker-Toolkit: An Open Source Toolkit for Multi-modal Speaker Verification and Diarization
This paper introduces 3D-Speaker-Toolkit, an open source toolkit for multi-modal speaker verification and diarization.
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models
First, we provide an open-source platform for researchers in the speaker recognition community to effortlessly build models.
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews
If an entry-level graphics card is available, transcription time drops to about 20% of the audio duration.
Leveraging In-the-Wild Data for Effective Self-Supervised Pretraining in Speaker Recognition
Current speaker recognition systems primarily rely on supervised approaches, constrained by the scale of labeled datasets.
SLMIA-SR: Speaker-Level Membership Inference Attacks against Speaker Recognition Systems
Our attack is versatile and can work in both white-box and black-box scenarios.
SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?
Self-supervised learning (SSL) for speech representation has been successfully applied in various downstream tasks, such as speech and speaker recognition.
Can Self-Supervised Neural Representations Pre-Trained on Human Speech distinguish Animal Callers?
Self-supervised learning (SSL) models use only the intrinsic structure of a given signal, independent of its acoustic domain, to extract essential information from the input to an embedding space.
Vocal Style Factorization for Effective Speaker Recognition in Affective Scenarios
The accuracy of automated speaker recognition is negatively impacted by changes in the emotional state of a person's speech.
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
This paper summarises the findings from the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22), which was held in conjunction with INTERSPEECH 2022.
Probabilistic Back-ends for Online Speaker Recognition and Clustering
This paper focuses on multi-enrollment speaker recognition which naturally occurs in the task of online speaker clustering, and studies the properties of different scoring back-ends in this scenario.
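One simple multi-enrollment back-end is to average the enrollment embeddings into a single centroid and score the test utterance against it with cosine similarity. The sketch below shows this baseline only; the paper itself studies probabilistic alternatives, and the dimensions and synthetic vectors here are assumptions for illustration:

```python
import numpy as np

def enroll(embeddings: np.ndarray) -> np.ndarray:
    """Average several enrollment embeddings (one per row) into one speaker model."""
    centroid = embeddings.mean(axis=0)
    return centroid / np.linalg.norm(centroid)

def score(model: np.ndarray, test_emb: np.ndarray) -> float:
    """Cosine similarity between the speaker model and a test embedding."""
    return float(np.dot(model, test_emb / np.linalg.norm(test_emb)))

# Synthetic embeddings standing in for a real front-end's output.
rng = np.random.default_rng(1)
speaker = rng.normal(size=128)
enrollments = speaker + 0.2 * rng.normal(size=(3, 128))  # three enrollment utterances
model = enroll(enrollments)

same = speaker + 0.2 * rng.normal(size=128)
other = rng.normal(size=128)
print(score(model, same) > score(model, other))  # target speaker should score higher
```

Averaging before scoring is only one choice; scoring against each enrollment and averaging the scores, or using a probabilistic back-end, can behave differently as the number of enrollment utterances grows, which is the trade-off the paper examines.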