Speaker Verification
171 papers with code • 5 benchmarks • 6 datasets
Speaker verification is the task of verifying the identity of a person from the characteristics of their voice.
Libraries
Use these libraries to find Speaker Verification models and implementations.
Latest papers
Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing
Audio anti-spoofing for automatic speaker verification aims to safeguard users' identities from spoofing attacks.
Towards single integrated spoofing-aware speaker verification embeddings
Second, competitive performance should be demonstrated compared to the fusion of automatic speaker verification (ASV) and countermeasure (CM) embeddings, which outperformed single embedding solutions by a large margin in the SASV2022 challenge.
One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification
This paper suggests One-Step Knowledge Distillation and Fine-Tuning (OS-KDFT), which incorporates KD and fine-tuning (FT).
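The core idea of combining knowledge distillation with task fine-tuning in a single optimisation step can be illustrated with a joint loss. This is a minimal sketch, not the actual OS-KDFT objective: the weighting scheme, temperature, and architecture in the paper may differ, and `alpha` is an illustrative hyperparameter.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over a 1-D logit vector
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def joint_kd_ft_loss(student_logits, teacher_logits, speaker_logits, label, alpha=0.5):
    """One combined objective: a distillation term (match the teacher's
    soft targets) plus a speaker-classification fine-tuning term.
    The exact formulation in OS-KDFT is not reproduced here (assumption)."""
    # cross-entropy between teacher and student distributions
    kd = -np.sum(softmax(teacher_logits) * np.log(softmax(student_logits) + 1e-12))
    # standard cross-entropy on the speaker label
    ce = -np.log(softmax(speaker_logits)[label] + 1e-12)
    return alpha * kd + (1 - alpha) * ce

loss = joint_kd_ft_loss(np.array([1.0, 2.0]), np.array([1.5, 1.8]),
                        np.array([0.2, 3.0]), label=1)
```

Optimising both terms in one pass is what removes the separate distillation-then-fine-tune stage.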
An Enhanced Res2Net with Local and Global Feature Fusion for Speaker Verification
This paper proposes a novel architecture called Enhanced Res2Net (ERes2Net), which incorporates both local and global feature fusion techniques to improve the performance.
ACA-Net: Towards Lightweight Speaker Verification using Asymmetric Cross Attention
In this paper, we propose ACA-Net, a lightweight, global context-aware speaker embedding extractor for Speaker Verification (SV) that improves upon existing work by using Asymmetric Cross Attention (ACA) to replace temporal pooling.
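Replacing temporal pooling with cross attention means a small, fixed set of learnable queries attends over all frames, so the embedding size no longer depends on utterance length. The single-head form and shapes below are a simplified sketch of that idea, not the ACA-Net architecture itself.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_pool(frames, queries):
    """Pool a variable-length frame sequence with cross attention.

    frames:  (T, C) frame-level features, T varies per utterance
    queries: (Q, C) learnable query vectors, Q is fixed (Q << T,
             hence 'asymmetric' in this sketch's sense)
    returns: (Q, C) utterance-level representation, independent of T
    """
    scores = queries @ frames.T / np.sqrt(frames.shape[1])  # (Q, T)
    weights = softmax(scores, axis=1)                       # attend over time
    return weights @ frames                                 # (Q, C)

pooled_short = cross_attention_pool(np.random.randn(50, 64), np.zeros((4, 64)))
pooled_long = cross_attention_pool(np.random.randn(300, 64), np.zeros((4, 64)))
```

Unlike mean or statistics pooling, the queries can learn to focus on different temporal regions.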
CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds
This paper describes the Ubenwa CryCeleb dataset - a labeled collection of infant cries - and the accompanying CryCeleb 2023 task, which is a public speaker verification challenge based on cry sounds.
DS-TDNN: Dual-stream Time-delay Neural Network with Global-aware Filter for Speaker Verification
To effectively leverage the long-term dependencies of audio signals and constrain model complexity, we introduce a novel module called Global-aware Filter layer (GF layer) in this work, which employs a set of learnable transform-domain filters between a 1D discrete Fourier transform and its inverse transform to capture global context.
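The mechanism described, learnable filters applied between a 1-D DFT and its inverse, amounts to element-wise multiplication in the frequency domain, which mixes information across the whole sequence at once. A minimal sketch follows; the filter parameterisation and placement inside DS-TDNN are assumptions, not the paper's exact design.

```python
import numpy as np

def global_filter_layer(x, filt):
    """Filter a feature sequence in the transform domain.

    x:    (T, C) real-valued features (time x channels)
    filt: (T//2 + 1, C) complex per-frequency weights, standing in for
          the learnable transform-domain filters (illustrative shape)
    """
    X = np.fft.rfft(x, axis=0)      # 1-D DFT along the time axis
    Y = X * filt                    # element-wise filtering = global context mixing
    return np.fft.irfft(Y, n=x.shape[0], axis=0)  # inverse DFT back to time

T, C = 8, 2
x = np.random.randn(T, C)
identity = np.ones((T // 2 + 1, C), dtype=complex)  # all-pass filter
y = global_filter_layer(x, identity)
```

Because every frequency bin depends on all time steps, one such layer captures long-range dependencies at O(T log T) cost rather than the quadratic cost of self-attention.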
Can spoofing countermeasure and speaker verification systems be jointly optimised?
Spoofing countermeasure (CM) and automatic speaker verification (ASV) sub-systems can be used in tandem with a backend classifier as a solution to the spoofing aware speaker verification (SASV) task.
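A tandem CM + ASV system can be sketched as a gating decision: the countermeasure first screens for spoofing, and only bona fide trials fall through to the speaker-verification decision. The thresholds and the hard cascade below are illustrative assumptions; actual SASV backends typically learn a classifier over both scores.

```python
def sasv_decide(asv_score, cm_score, asv_thr=0.5, cm_thr=0.5):
    """Tandem decision rule: reject if the countermeasure flags a spoof,
    otherwise fall back to the ASV accept/reject decision.
    Thresholds are hypothetical placeholders, not calibrated values."""
    if cm_score < cm_thr:        # CM says spoofed audio -> reject outright
        return False
    return asv_score >= asv_thr  # bona fide -> ordinary speaker decision

accept = sasv_decide(asv_score=0.9, cm_score=0.8)
reject_spoof = sasv_decide(asv_score=0.9, cm_score=0.2)
```

Jointly optimising the two sub-systems, as the paper asks, means training them so that the combined decision, rather than each score in isolation, is optimal.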
Cross-modal Audio-visual Co-learning for Text-independent Speaker Verification
Visual speech (i.e., lip motion) is highly correlated with auditory speech due to their co-occurrence and synchronization in speech production.
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
This paper summarises the findings from the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC-22), which was held in conjunction with INTERSPEECH 2022.