Rather than employing standard hand-crafted features, these CNNs learn low-level speech representations directly from waveforms, potentially allowing the network to better capture important narrow-band speaker characteristics such as pitch and formants.
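As a rough illustration of this idea, the sketch below is a minimal 1D CNN that consumes raw waveform samples directly; the `RawWaveformCNN` name and layer sizes are hypothetical and do not reproduce any specific published architecture.

```python
import torch
import torch.nn as nn

class RawWaveformCNN(nn.Module):
    """Minimal 1D CNN over raw waveform samples (illustrative sketch only)."""
    def __init__(self, n_speakers: int):
        super().__init__()
        self.features = nn.Sequential(
            # A wide first filter lets the network learn band-pass-like
            # responses over raw samples instead of hand-crafted features.
            nn.Conv1d(1, 80, kernel_size=251, stride=5), nn.ReLU(), nn.MaxPool1d(3),
            nn.Conv1d(80, 60, kernel_size=5), nn.ReLU(), nn.MaxPool1d(3),
            nn.Conv1d(60, 60, kernel_size=5), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(60, n_speakers)

    def forward(self, wav):                  # wav: (batch, samples)
        x = self.features(wav.unsqueeze(1))  # add channel dim -> (batch, 1, samples)
        return self.classifier(x.squeeze(-1))

model = RawWaveformCNN(n_speakers=1251)    # VoxCeleb1 has 1,251 speakers
logits = model(torch.randn(4, 16000))      # one second of 16 kHz audio per item
```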
We present Deep Speaker, a neural speaker embedding system that maps utterances to a hypersphere where speaker similarity is measured by cosine similarity.
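A minimal sketch of the scoring described here, assuming any encoder that outputs fixed-size embeddings (the stand-in `nn.Linear` below is not Deep Speaker's actual network): L2 normalization places embeddings on the unit hypersphere, where cosine similarity reduces to a dot product.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in encoder; Deep Speaker's own network is not reproduced here.
encoder = nn.Linear(40, 512)

def embed(features):
    """Map utterance features to the unit hypersphere via L2 normalization."""
    return F.normalize(encoder(features), p=2, dim=-1)

def speaker_similarity(e1, e2):
    # For unit vectors, cosine similarity is just the dot product.
    return (e1 * e2).sum(dim=-1)

a, b = embed(torch.randn(1, 40)), embed(torch.randn(1, 40))
print(speaker_similarity(a, b))  # in [-1, 1]; higher means more likely same speaker
```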
Learning meaningful and general representations from unannotated speech that are applicable to a wide range of tasks remains challenging.
Speaker recognition systems based on Convolutional Neural Networks (CNNs) are often built with off-the-shelf backbones such as VGG-Net or ResNet.
While applications of transfer learning are common in computer vision and natural language processing, audio and speech processing surprisingly lack readily available, transferable models.
Mutual Information (MI) or similar measures of statistical dependence are promising tools for learning these representations in an unsupervised way.
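One common way to operationalize this is the InfoNCE objective, a standard lower-bound estimator of mutual information. The sketch below assumes paired embeddings of two views of the same utterance; all names and the temperature value are illustrative rather than taken from any particular paper.

```python
import torch
import torch.nn.functional as F

def infonce_loss(anchors, positives, temperature=0.1):
    """InfoNCE: classify which positive matches each anchor among the batch.
    `anchors` and `positives` are (batch, dim) embeddings of paired views,
    e.g. two chunks drawn from the same utterance."""
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.t() / temperature     # (batch, batch) similarity matrix
    targets = torch.arange(a.size(0))    # the matching pair sits on the diagonal
    return F.cross_entropy(logits, targets)

loss = infonce_loss(torch.randn(8, 256), torch.randn(8, 256))
```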
To address this demand, we propose a portable model called Additive Margin MobileNet1D (AM-MobileNet1D) for speaker identification on mobile devices.
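The additive-margin component named here refers to the AM-Softmax loss. Below is a generic sketch of that loss; the scale and margin values are common defaults, not necessarily the paper's exact settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdditiveMarginSoftmax(nn.Module):
    """Generic AM-Softmax head: cosine logits with a margin on the target class."""
    def __init__(self, dim, n_speakers, s=30.0, m=0.35):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_speakers, dim))
        self.s, self.m = s, m

    def forward(self, x, labels):
        # Cosine between normalized embeddings and normalized class weights.
        cos = F.linear(F.normalize(x), F.normalize(self.weight))
        # Subtract the margin m only from the target-class cosine.
        margin = torch.zeros_like(cos).scatter_(1, labels.unsqueeze(1), self.m)
        return F.cross_entropy(self.s * (cos - margin), labels)

head = AdditiveMarginSoftmax(dim=128, n_speakers=1251)
loss = head(torch.randn(4, 128), torch.randint(0, 1251, (4,)))
```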
We computationally model the overlapping information between faces and voices and show that the learned cross-modal representation contains enough information to identify matching faces and voices with performance similar to that of humans.
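A minimal sketch of this kind of cross-modal matching, assuming two illustrative linear encoders that project each modality into a shared embedding space (the paper's actual architectures are not reproduced here): a face and a voice are compared by cosine similarity, and a forced-choice match picks the closest voice.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical two-tower encoders into a shared 128-d space.
face_encoder = nn.Linear(512, 128)    # input: a face-image feature vector
voice_encoder = nn.Linear(256, 128)   # input: an utterance feature vector

def match_score(face_feat, voice_feat):
    """Cosine similarity in the shared cross-modal embedding space."""
    f = F.normalize(face_encoder(face_feat), dim=-1)
    v = F.normalize(voice_encoder(voice_feat), dim=-1)
    return (f * v).sum(dim=-1)

# Forced-choice matching: pick the voice whose embedding is closest to the face.
face = torch.randn(1, 512)
voices = torch.randn(2, 256)
scores = match_score(face.expand(2, -1), voices)
print(scores.argmax().item())  # index of the better-matching voice
```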