Speaker Recognition

90 papers with code • 1 benchmark • 6 datasets

Speaker Recognition is the process of identifying or confirming the identity of a person given segments of their speech.

Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition
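The two tasks named in the definition, identifying a speaker and confirming a claimed identity, can be illustrated with a minimal sketch. It assumes each speech segment has already been turned into a fixed-length embedding vector and that embeddings are compared by cosine similarity; the definition above does not prescribe this, and the speaker names, toy vectors, and the 0.7 threshold are hypothetical values chosen only for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(test_embedding, enrolled):
    """Identification: return the enrolled speaker most similar to the test segment."""
    return max(enrolled, key=lambda name: cosine_similarity(test_embedding, enrolled[name]))

def verify(test_embedding, claimed_embedding, threshold=0.7):
    """Verification: accept the claimed identity if similarity reaches the threshold."""
    return cosine_similarity(test_embedding, claimed_embedding) >= threshold

# Toy 3-dimensional embeddings (hypothetical; real systems use learned vectors).
enrolled = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0],
}
test = [0.9, 0.1, 0.0]

print(identify(test, enrolled))          # alice
print(verify(test, enrolled["alice"]))   # True
print(verify(test, enrolled["bob"]))     # False
```

In practice the threshold would be tuned on held-out trials rather than fixed by hand.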

Benchmarks

These leaderboards are used to track progress in Speaker Recognition.

Dataset	Best Model
VoxCeleb1	WavLM+ECAPA-TDNN

Libraries

Use these libraries to find Speaker Recognition models and implementations.

s3prl/s3prl • 2 papers • 2,092

andi611/Self-Supervised-Speech-Pret… • 2 papers • 2,092

Jungjee/RawNet • 2 papers • 332

Most implemented papers

Speaker Recognition from Raw Waveform with SincNet

mravanelli/SincNet • 29 Jul 2018

Rather than employing standard hand-crafted features, CNNs fed raw speech samples learn low-level speech representations from waveforms, potentially allowing the network to better capture important narrow-band speaker characteristics such as pitch and formants.

Deep Speaker: an End-to-End Neural Speaker Embedding System

philipperemy/deep-speaker • 5 May 2017

We present Deep Speaker, a neural speaker embedding system that maps utterances to a hypersphere where speaker similarity is measured by cosine similarity.

Utterance-level Aggregation For Speaker Recognition In The Wild

WeidiXie/VGG-Speaker-Recognition • 26 Feb 2019

The objective of this paper is speaker recognition "in the wild", where utterances may be of variable length and also contain irrelevant signals.

Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders

andi611/Self-Supervised-Speech-Pretraining-and-Representation-Learning • 25 Oct 2019

We present Mockingjay as a new speech representation learning approach, where bidirectional Transformer encoders are pre-trained on a large amount of unlabeled speech.

TERA: Self-Supervised Learning of Transformer Encoder Representation for Speech

s3prl/s3prl • 12 Jul 2020

We present a large-scale comparison of various self-supervised models.

VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking

Edresson/VoiceSplit • 11 Oct 2018

In this paper, we present a novel system that separates the voice of a target speaker from multi-speaker signals, by making use of a reference signal from the target speaker.

AM-MobileNet1D: A Portable Model for Speaker Recognition

joaoantoniocn/AM-MobileNet1D • 31 Mar 2020

To address this demand, we propose a portable model called Additive Margin MobileNet1D (AM-MobileNet1D) for Speaker Identification on mobile devices.

AutoSpeech: Neural Architecture Search for Speaker Recognition

TAMU-VITA/AutoSpeech • 7 May 2020

Speaker recognition systems based on Convolutional Neural Networks (CNNs) are often built with off-the-shelf backbones such as VGG-Net or ResNet.

HLT-NUS SUBMISSION FOR 2020 NIST Conversational Telephone Speech SRE

taoruijie/ecapatdnn • 12 Nov 2021

This work provides a brief description of the system submitted by the Human Language Technology (HLT) Laboratory, National University of Singapore (NUS), to the 2020 NIST conversational telephone speech (CTS) speaker recognition evaluation (SRE).

Probabilistic Spherical Discriminant Analysis: An Alternative to PLDA for length-normalized embeddings

bsxfan/PSDA • 28 Mar 2022

In speaker recognition, where speech segments are mapped to embeddings on the unit hypersphere, two scoring backends are commonly used, namely cosine scoring and PLDA.
