Speaker Verification
170 papers with code • 5 benchmarks • 6 datasets
Speaker verification is the task of verifying a person's claimed identity from the characteristics of their voice.
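In practice, most modern systems extract a fixed-dimensional speaker embedding from each utterance and compare an enrollment embedding against a test embedding, accepting the identity claim when their similarity exceeds a threshold. A minimal sketch of that scoring step (the embeddings and the threshold value are illustrative assumptions, not taken from any specific paper below):

```python
import numpy as np

def cosine_score(enroll_embeddings, test_embedding):
    """Score a trial as the cosine similarity between the averaged,
    length-normalized enrollment embeddings and the test embedding."""
    enroll = np.mean(
        [e / np.linalg.norm(e) for e in enroll_embeddings], axis=0
    )
    enroll /= np.linalg.norm(enroll)
    test = test_embedding / np.linalg.norm(test_embedding)
    return float(np.dot(enroll, test))

def verify(enroll_embeddings, test_embedding, threshold=0.7):
    """Accept the identity claim if the score exceeds the threshold.
    The threshold here is a hypothetical value; real systems tune it
    on a development set to trade off false accepts and rejects."""
    return cosine_score(enroll_embeddings, test_embedding) >= threshold
```

The embeddings themselves would come from a trained speaker encoder (e.g. an x-vector or similar network); the backend shown here is plain cosine scoring.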
Libraries
Use these libraries to find Speaker Verification models and implementations

Most implemented papers
VoxCeleb2: Deep Speaker Recognition
The objective of this paper is speaker recognition under noisy and unconstrained conditions.
Multiobjective Optimization Training of PLDA for Speaker Verification
Most current state-of-the-art text-independent speaker verification systems take probabilistic linear discriminant analysis (PLDA) as their backend classifiers.
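A PLDA backend scores each verification trial as a log-likelihood ratio between the same-speaker and different-speaker hypotheses. A minimal two-covariance sketch of that score (the covariance matrices `B` and `W` would be estimated from training embeddings; everything here is an illustrative simplification, not the method of the paper above):

```python
import numpy as np

def gaussian_logpdf(x, cov):
    """Log-density of a zero-mean multivariate Gaussian."""
    d = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)

def plda_llr(x1, x2, B, W):
    """Two-covariance PLDA log-likelihood ratio for one trial.

    B: between-speaker covariance, W: within-speaker covariance.
    Same-speaker hypothesis: both embeddings share one latent speaker
    variable, giving cross-covariance B between them; different-speaker
    hypothesis: the two embeddings are independent."""
    T = B + W
    x = np.concatenate([x1, x2])
    cov_same = np.block([[T, B], [B, T]])
    cov_diff = np.block([[T, np.zeros_like(B)], [np.zeros_like(B), T]])
    return gaussian_logpdf(x, cov_same) - gaussian_logpdf(x, cov_diff)
```

A positive score favors the same-speaker hypothesis; in deployed systems the raw score is then calibrated before thresholding.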
DELTA: A DEep learning based Language Technology plAtform
In this paper we present DELTA, a deep learning based language technology platform.
Personal VAD: Speaker-Conditioned Voice Activity Detection
In this paper, we propose "personal VAD", a system to detect the voice activity of a target speaker at the frame level.
Adversarial Attacks on GMM i-vector based Speaker Verification Systems
Experiment results show that GMM i-vector systems are seriously vulnerable to adversarial attacks, and the crafted adversarial samples prove to be transferable and pose threats to neural-network speaker-embedding systems (e.g. x-vector systems).
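Such attacks are typically gradient-based; the fast gradient sign method (FGSM) is the simplest instance. A toy sketch on a linear scorer (the linear model stands in for the actual GMM i-vector pipeline, which is far more involved; `w`, `b`, and `eps` are illustrative assumptions):

```python
import numpy as np

def score(x, w, b):
    """Toy linear verification scorer standing in for a real backend."""
    return float(w @ x + b)

def fgsm_attack(x, w, b, eps):
    """FGSM: perturb an impostor's feature vector in the direction that
    increases the verification score, bounded by eps in the L-infinity
    norm.  For a linear scorer d(score)/dx = w, so the perturbation is
    eps * sign(w)."""
    return x + eps * np.sign(w)
```

With a nonlinear model the gradient would be obtained by backpropagation instead of read off analytically, but the sign-and-step structure of the attack is the same.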
A Speaker Verification Backend for Improved Calibration Performance across Varying Conditions
In a recent work, we presented a discriminative backend for speaker verification that achieved good out-of-the-box calibration performance on most tested conditions containing varying levels of mismatch to the training conditions.
Improved RawNet with Feature Map Scaling for Text-independent Speaker Verification using Raw Waveforms
Recent advances in deep learning have facilitated the design of speaker verification systems that directly input raw waveforms.
Crossed-Time Delay Neural Network for Speaker Recognition
Time Delay Neural Network (TDNN) is a well-performing structure for DNN-based speaker recognition systems.
FragmentVC: Any-to-Any Voice Conversion by End-to-End Extracting and Fusing Fine-Grained Voice Fragments With Attention
Any-to-any voice conversion aims to convert the voice from and to any speakers even unseen during training, which is much more challenging compared to one-to-one or many-to-many tasks, but much more attractive in real-world scenarios.
Channel-wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks
This argument motivates the current work that presents a novel, channel-wise gated Res2Net (CG-Res2Net), which modifies Res2Net to enable a channel-wise gating mechanism in the connection between feature groups.
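Generically, a channel-wise gate pools each channel, maps the pooled statistics through a small transform, and rescales the channels by a sigmoid factor in (0, 1). A minimal squeeze-and-excitation-style sketch of that mechanism (not the exact CG-Res2Net connection; `W` and `b` are hypothetical gate parameters that would be learned):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_gate(x, W, b):
    """Channel-wise gating: average-pool each channel over time, compute
    per-channel gates from the pooled vector, and scale every channel by
    its gate.  x has shape (channels, time)."""
    pooled = x.mean(axis=1)          # (channels,)
    g = sigmoid(W @ pooled + b)      # (channels,) gates in (0, 1)
    return x * g[:, None]
```

Because each gate lies strictly between 0 and 1, the mechanism can only attenuate channels, letting the network suppress feature groups that carry artifacts.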