Speaker Verification

170 papers with code • 5 benchmarks • 6 datasets

Speaker verification is the task of verifying a person's claimed identity from characteristics of their voice.


Most implemented papers

End-to-End Text-Dependent Speaker Verification

Janghyun1230/Speaker_Verification 27 Sep 2015

In this paper we present a data-driven, integrated approach to speaker verification, which maps a test utterance and a few reference utterances directly to a single score for verification and jointly optimizes the system's components using the same evaluation protocol and metric as at test time.
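The idea of mapping a test utterance and a few reference utterances to a single score can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes each utterance has already been turned into a fixed-length embedding by some network, that the references are averaged into a speaker model, and that the score is a logistic function of cosine similarity. The function names, the scale `w`, the bias `b`, and the toy embeddings are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def speaker_model(reference_embeddings):
    """Average the reference (enrollment) embeddings dimension by dimension."""
    n = len(reference_embeddings)
    return [sum(dim) / n for dim in zip(*reference_embeddings)]

def verification_score(test_embedding, reference_embeddings, w=10.0, b=-5.0):
    """Map one test utterance and a few references to a score in (0, 1).

    w and b are assumed to be trainable parameters; the values here are
    placeholders chosen only for illustration.
    """
    model = speaker_model(reference_embeddings)
    similarity = cosine_similarity(test_embedding, model)
    return 1.0 / (1.0 + math.exp(-(w * similarity + b)))

def verification_loss(score, is_target):
    """Binary cross-entropy on the score, so training uses the test-time score."""
    eps = 1e-12  # guards against log(0)
    return -math.log(score + eps) if is_target else -math.log(1.0 - score + eps)

# Toy embeddings (hypothetical): three references from the claimed speaker.
references = [[0.9, 0.1, 0.4], [0.8, 0.2, 0.5], [1.0, 0.0, 0.3]]
target_score = verification_score([0.85, 0.15, 0.45], references)
impostor_score = verification_score([-0.3, 0.9, 0.1], references)
```

Because the loss is computed on the same score that is thresholded at test time, gradients could in principle flow back through the scoring step into the embedding network, which is what makes such a setup end-to-end.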

Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data

wnhsu/FactorizedHierarchicalVAE NeurIPS 2017

We present a factorized hierarchical variational autoencoder, which learns disentangled and interpretable representations from sequential data without supervision.

rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method

zhenghuatan/rVAD 9 Jun 2019

Finally, an a posteriori SNR-weighted energy difference is applied to the extended pitch segments of the denoised speech signal to detect voice activity.

Ludwig: a type-based declarative deep learning toolbox

uber/ludwig 17 Sep 2019

In this work we present Ludwig, a flexible, extensible, and easy-to-use toolbox that allows users to train deep learning models and use them to obtain predictions without writing code.

AutoSpeech: Neural Architecture Search for Speaker Recognition

TAMU-VITA/AutoSpeech 7 May 2020

Speaker recognition systems based on Convolutional Neural Networks (CNNs) are often built with off-the-shelf backbones such as VGG-Net or ResNet.

One-class learning towards generalized voice spoofing detection

yzyouzhang/AIR-ASVspoof 27 Oct 2020

Human voices can be used to authenticate the identity of the speaker, but automatic speaker verification (ASV) systems are vulnerable to voice spoofing attacks such as impersonation, replay, text-to-speech, and voice conversion.

An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems

yzyouzhang/Empirical-Channel-CM 3 Apr 2021

Spoofing countermeasure (CM) systems are critical in speaker verification; they aim to discern spoofing attacks from bona fide speech trials.

3D Convolutional Neural Networks for Cross Audio-Visual Matching Recognition

astorfi/lip-reading-deeplearning 18 Jun 2017

We propose the use of a coupled 3D Convolutional Neural Network (3D-CNN) architecture that can map both modalities into a representation space to evaluate the correspondence of audio-visual streams using the learned multimodal features.

Attention-Based Models for Text-Dependent Speaker Verification

liyongze/lstm_speaker_verification 28 Oct 2017

Attention-based models have recently shown great performance on a range of tasks, such as speech recognition, machine translation, and image captioning, due to their ability to summarize relevant information that spans the entire length of an input sequence.

Scalable Factorized Hierarchical Variational Autoencoder Training

wnhsu/ScalableFHVAE 9 Apr 2018

Deep generative models have achieved great success in unsupervised learning with the ability to capture complex nonlinear relationships between latent generating factors and observations.