Speaker Verification

171 papers with code • 5 benchmarks • 6 datasets

Speaker verification is the task of verifying the identity of a person from characteristics of their voice.
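As a rough illustration of the definition above, a verification decision is often framed as comparing a speaker embedding from an enrollment utterance against one from a test utterance. The sketch below assumes the embeddings already exist as plain lists of floats (how they are extracted is out of scope here), and the function names and the 0.5 threshold are hypothetical choices for illustration, not values taken from any paper listed on this page.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def verify(enrolled, test, threshold=0.5):
    """Accept the identity claim if the similarity reaches the threshold.

    The threshold is an assumed placeholder; in practice it would be
    tuned on held-out trials for the target operating point.
    """
    return cosine_similarity(enrolled, test) >= threshold


if __name__ == "__main__":
    enrolled = [0.2, 0.9, 0.1]   # hypothetical enrollment embedding
    same = [0.25, 0.85, 0.05]    # hypothetical test embedding, same speaker
    other = [0.9, -0.1, 0.3]     # hypothetical test embedding, other speaker
    print(verify(enrolled, same), verify(enrolled, other))
```

Raising the threshold makes the system reject more impostors at the cost of rejecting more genuine speakers, which is the trade-off that verification metrics summarize.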

Most implemented papers

SVEva Fair: A Framework for Evaluating Fairness in Speaker Verification

wiebket/sveva-fair 26 Jul 2021

Despite the success of deep neural networks (DNNs) in enabling on-device voice assistants, increasing evidence of bias and discrimination in machine learning is raising the urgency of investigating the fairness of these systems.

TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context

NVIDIA/NeMo 8 Oct 2021

In this paper, we propose TitaNet, a novel neural network architecture for extracting speaker representations.

Pushing the limits of raw waveform speaker recognition

clovaai/voxceleb_trainer 16 Mar 2022

Our best model achieves an equal error rate of 0.89%, which is competitive with the state-of-the-art models based on handcrafted features, and outperforms the best model based on raw waveform inputs by a large margin.
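For readers unfamiliar with the metric quoted above, the equal error rate (EER) is the operating point at which the false acceptance rate and the false rejection rate are equal. The following is a minimal sketch of one common way to approximate it from trial scores, assuming higher scores mean "more likely the same speaker"; the function name and the simple threshold sweep are illustrative assumptions and are not the evaluation code of the paper above.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over the observed scores.

    target_scores: scores of genuine (same-speaker) trials.
    nontarget_scores: scores of impostor (different-speaker) trials.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap, best_eer = None, None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a genuine trial scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: an impostor trial scored at or above it.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


if __name__ == "__main__":
    genuine = [0.9, 0.8, 0.75, 0.4]    # hypothetical same-speaker scores
    impostor = [0.6, 0.3, 0.2, 0.1]    # hypothetical different-speaker scores
    print(f"EER: {equal_error_rate(genuine, impostor):.2%}")
```

With only a handful of trials the sweep is coarse; evaluation toolkits typically interpolate between neighbouring thresholds on much larger trial lists.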

A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

PaddlePaddle/PaddleSpeech 18 Mar 2022

Recently, speech representation learning has improved many speech-related tasks such as speech recognition, speech classification, and speech-to-text translation.

ConvNeXt Based Neural Network for Audio Anti-Spoofing

MS-Mind/MS-Code-02 14 Sep 2022

With the rapid development of speech conversion and speech synthesis algorithms, automatic speaker verification (ASV) systems are vulnerable to spoofing attacks.

CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds

ubenwa/cryceleb2023 1 May 2023

This paper describes the Ubenwa CryCeleb dataset - a labeled collection of infant cries - and the accompanying CryCeleb 2023 task, which is a public speaker verification challenge based on cry sounds.

An Enhanced Res2Net with Local and Global Feature Fusion for Speaker Verification

alibaba-damo-academy/3D-Speaker 22 May 2023

This paper proposes a novel architecture called Enhanced Res2Net (ERes2Net), which incorporates both local and global feature fusion techniques to improve the performance.

Pairwise Similarity Learning is SimPLE

ydwen/opensphere ICCV 2023

In this paper, we focus on a general yet important learning problem, pairwise similarity learning (PSL).

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

espnet/espnet 30 Jan 2024

First, we provide an open-source platform for researchers in the speaker recognition community to effortlessly build models.