Speaker Verification
171 papers with code • 5 benchmarks • 6 datasets
Speaker verification is the task of verifying the identity of a person from characteristics of their voice.
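As a rough illustration of the task, one common setup (an assumption here, not something this page specifies) compares a fixed-length embedding of an enrolled speaker with an embedding of a test utterance and accepts the claim when their cosine similarity clears a threshold. The function names, the toy vectors, and the 0.7 threshold below are hypothetical; real systems obtain embeddings from a trained model and tune the threshold on held-out trials.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(enrolled_embedding, test_embedding, threshold=0.7):
    """Accept the identity claim if the embeddings are similar enough.

    The threshold of 0.7 is an arbitrary placeholder, not a recommended value.
    """
    return cosine_similarity(enrolled_embedding, test_embedding) >= threshold


if __name__ == "__main__":
    enrolled = [0.2, 0.9, 0.1]       # hypothetical enrollment embedding
    same_speaker = [0.25, 0.85, 0.05]  # hypothetical test embedding, same voice
    other_speaker = [0.9, 0.1, 0.3]    # hypothetical test embedding, other voice
    print(verify(enrolled, same_speaker))
    print(verify(enrolled, other_speaker))
```

Raising the threshold rejects more impostors at the cost of rejecting more genuine speakers, which is the trade-off that metrics such as the equal error rate summarize.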
Most implemented papers
SVEva Fair: A Framework for Evaluating Fairness in Speaker Verification
Despite the success of deep neural networks (DNNs) in enabling on-device voice assistants, increasing evidence of bias and discrimination in machine learning is raising the urgency of investigating the fairness of these systems.
TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context
In this paper, we propose TitaNet, a novel neural network architecture for extracting speaker representations.
Pushing the limits of raw waveform speaker recognition
Our best model achieves an equal error rate of 0.89%, which is competitive with the state-of-the-art models based on handcrafted features, and outperforms the best model based on raw waveform inputs by a large margin.
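For readers unfamiliar with the metric, the equal error rate (EER) is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below is a simple threshold sweep over toy scores and is not the evaluation code of the paper above; the function name and the example scores are hypothetical. With finite data the two rates may not cross exactly, so this version reports their average at the threshold where they are closest.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    target_scores: scores of trials where the claimed identity is correct.
    nontarget_scores: scores of impostor trials.
    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # False rejection: genuine trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: impostor trials scored at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer


if __name__ == "__main__":
    targets = [0.9, 0.6, 0.4, 0.8]     # hypothetical genuine-trial scores
    nontargets = [0.1, 0.5, 0.3, 0.7]  # hypothetical impostor-trial scores
    print(f"EER: {equal_error_rate(targets, nontargets):.2%}")
```

Production toolkits typically interpolate between neighboring points of the error curve rather than taking the nearest threshold, so values from this sketch can differ slightly from reported figures.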
A$^3$T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing
Recently, speech representation learning has improved many speech-related tasks such as speech recognition, speech classification, and speech-to-text translation.
PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit
PaddleSpeech is an open-source all-in-one speech toolkit.
ConvNeXt Based Neural Network for Audio Anti-Spoofing
With the rapid development of speech conversion and speech synthesis algorithms, automatic speaker verification (ASV) systems are vulnerable to spoofing attacks.
CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds
This paper describes the Ubenwa CryCeleb dataset - a labeled collection of infant cries - and the accompanying CryCeleb 2023 task, which is a public speaker verification challenge based on cry sounds.
An Enhanced Res2Net with Local and Global Feature Fusion for Speaker Verification
This paper proposes a novel architecture called Enhanced Res2Net (ERes2Net), which incorporates both local and global feature fusion techniques to improve the performance.
Pairwise Similarity Learning is SimPLE
In this paper, we focus on a general yet important learning problem, pairwise similarity learning (PSL).
ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models
First, we provide an open-source platform for researchers in the speaker recognition community to effortlessly build models.