Speaker Verification
170 papers with code • 5 benchmarks • 6 datasets
Speaker verification is the task of verifying a person's identity from the characteristics of their voice.
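As a rough illustration of the task defined above, verification is often framed as comparing a fixed-length voice embedding from an enrolled speaker with one from a test utterance and accepting the identity claim when the two are similar enough. The sketch below assumes cosine similarity as the score and a threshold of 0.7; the toy embeddings, the function names, and the threshold are all hypothetical and do not come from any system listed on this page.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def verify(enrolled, test, threshold=0.7):
    """Accept the identity claim if the score reaches the threshold.

    The threshold is an assumed value; in practice it would be tuned
    on held-out trials to balance false accepts and false rejects.
    """
    return cosine_similarity(enrolled, test) >= threshold


# Toy 4-dimensional embeddings, made up for illustration only.
enrolled = [0.9, 0.1, 0.3, 0.2]
same_speaker = [0.8, 0.2, 0.35, 0.1]
other_speaker = [0.1, 0.9, -0.2, 0.4]

print(verify(enrolled, same_speaker))   # claim accepted
print(verify(enrolled, other_speaker))  # claim rejected
```

Real systems obtain the embeddings from a trained speaker encoder and may use other scoring back-ends; the thresholded comparison is the part this sketch is meant to show.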
Latest papers
Haha-Pod: An Attempt for Laughter-based Non-Verbal Speaker Verification
It is widely acknowledged that discriminative representation for speaker verification can be extracted from verbal speech.
Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models
Diff-SV unifies a DPM-based speech enhancement system with a speaker embedding extractor, and yields a discriminative and noise-tolerable speaker representation through a hierarchical structure.
RoDia: A New Dataset for Romanian Dialect Identification from Speech
We introduce RoDia, the first dataset for Romanian dialect identification from speech.
Self-Distillation Network with Ensemble Prototypes: Learning Robust Speaker Representations without Supervision
The method assigns the representations of augmented views of an utterance to the same prototypes as the representation of the original view, thereby enabling effective knowledge transfer between the views.
PAS: Partial Additive Speech Data Augmentation Method for Noise Robust Speaker Verification
In this paper, we propose a new additive noise method, partial additive speech (PAS), which aims to train SV systems to be less affected by noisy environments.
Exploring Binary Classification Loss For Speaker Verification
The mismatch between closed-set training and open-set testing usually leads to significant performance degradation in the speaker verification task.
Disentanglement in a GAN for Unconditional Speech Synthesis
We confirm that ASGAN's latent space is disentangled: we demonstrate how simple linear operations in the space can be used to perform several tasks unseen during training.
Long-term Conversation Analysis: Exploring Utility and Privacy
The analysis of conversations recorded in everyday life requires privacy protection.
Evaluation of Speech Representations for MOS prediction
Among the supervised and self-supervised learning models using BRSpeechMOS, Whisper-Small achieved the best linear correlation of 0.6980, and the speaker verification model, SpeakerNet, had a linear correlation of 0.6963.
Malafide: a novel adversarial convolutive noise attack against deepfake and spoofing detection systems
We present Malafide, a universal adversarial attack against automatic speaker verification (ASV) spoofing countermeasures (CMs).