Speaker Verification

170 papers with code • 5 benchmarks • 6 datasets

Speaker verification is the task of verifying a person's claimed identity from the characteristics of their voice.

(Image credit: Contrastive-Predictive-Coding-PyTorch)
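
As a rough illustration of the verification decision described above, the sketch below compares an enrolled voice embedding with a test embedding and accepts the identity claim when their cosine similarity reaches a threshold. Everything here is an assumption made for illustration: the function names, the toy 4-dimensional vectors, and the 0.5 threshold are hypothetical and are not taken from any of the listed papers. Real systems obtain embeddings from a trained speaker encoder and tune the threshold on held-out trials.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def verify(enrolled, test, threshold=0.5):
    """Accept the identity claim when similarity reaches the threshold.

    The default threshold is an arbitrary placeholder, not a recommended value.
    """
    return cosine_similarity(enrolled, test) >= threshold


# Toy embeddings, made up for illustration only.
enrolled = [0.9, 0.1, 0.3, 0.2]
same_speaker = [0.8, 0.2, 0.35, 0.1]
other_speaker = [-0.2, 0.9, -0.4, 0.1]

print(verify(enrolled, same_speaker))   # similarity is close to 1, so accepted
print(verify(enrolled, other_speaker))  # similarity is negative, so rejected
```

Raising the threshold trades false acceptances for false rejections, which is why verification systems are usually evaluated across a range of operating points rather than at a single fixed value.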

3D-Speaker-Toolkit: An Open Source Toolkit for Multi-modal Speaker Verification and Diarization

alibaba-damo-academy/3D-Speaker 29 Mar 2024

This paper introduces 3D-Speaker-Toolkit, an open source toolkit for multi-modal speaker verification and diarization.

691 stars

a-DCF: an architecture agnostic metric with application to spoofing-robust speaker verification

shimhz/a_dcf 3 Mar 2024

Spoofing detection is today a mainstream research topic.

3 stars

ChildAugment: Data Augmentation Methods for Zero-Resource Children's Speaker Verification

vpspeech/ChildAugment 23 Feb 2024

One promising approach is to align vocal-tract parameters between adults and children through children-specific data augmentation, referred to here as ChildAugment.

3 stars

ESPnet-SPK: full pipeline speaker embedding toolkit with reproducible recipes, self-supervised front-ends, and off-the-shelf models

espnet/espnet 30 Jan 2024

First, we provide an open-source platform for researchers in the speaker recognition community to effortlessly build models.

7,864 stars

NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification

dmlguq456/next_tdnn_asv 14 Dec 2023

Meanwhile, in vision tasks, ConvNet structures have been modernized by referring to Transformer, resulting in improved performance.

37 stars

Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification

wenet-e2e/wespeaker 6 Dec 2023

We represent the stride space on a trellis diagram, and conduct a systematic study on the impact of temporal and frequency resolutions on the performance and further identify two optimal points, namely Golden Gemini, which serves as a guiding principle for designing 2D ResNet-based speaker verification models.

524 stars

Learning Repeatable Speech Embeddings Using An Intra-class Correlation Regularizer

vigor-jzhang/icc-regularizer NeurIPS 2023

A good supervised embedding for a specific machine learning task is only sensitive to changes in the label of interest and is invariant to other confounding factors.

5 stars
25 Oct 2023

SALMONN: Towards Generic Hearing Abilities for Large Language Models

bytedance/salmonn 20 Oct 2023

Hearing is arguably an essential ability of artificial intelligence (AI) agents in the physical world, which refers to the perception and understanding of general auditory information consisting of at least three types of sounds: speech, audio events, and music.

788 stars

Pairwise Similarity Learning is SimPLE

ydwen/opensphere ICCV 2023

In this paper, we focus on a general yet important learning problem, pairwise similarity learning (PSL).

251 stars
13 Oct 2023