Speaker Identification

61 papers with code • 4 benchmarks • 4 datasets

Benchmarks

These leaderboards track progress in speaker identification.

Dataset	Best Model
VoxCeleb1	MSM-MAE
EVI en-GB	Fuzzy Retrieval
EVI pl-PL	Fuzzy Retrieval
EVI fr-FR	Fuzzy Retrieval

Datasets

Latest papers

Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input

nttcslab/m2d • 26 Oct 2022

We propose a new method, Masked Modeling Duo (M2D), that learns representations directly while obtaining training signals using only masked patches.

Cross-Lingual Speaker Identification Using Distant Supervision

slash0bz/speaker-identification • 11 Oct 2022

Speaker identification, determining which character said each utterance in literary text, benefits many downstream tasks.

IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages

AI4Bharat/indicSUPERB • 24 Aug 2022

We hope IndicSUPERB contributes to the progress of developing speech language understanding models for Indian languages.

Masked Autoencoders that Listen

facebookresearch/multimodal • 13 Jul 2022

Following the Transformer encoder-decoder design in MAE, our Audio-MAE first encodes audio spectrogram patches with a high masking ratio, feeding only the non-masked tokens through encoder layers.
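The masking step described above can be sketched in a few lines. This is only an illustrative sketch, not the Audio-MAE authors' implementation: the function names `random_mask_patches` and `encoder_inputs`, the toy patch data, and the 0.8 masking ratio are all assumptions chosen for the example.

```python
import random

def random_mask_patches(num_patches, mask_ratio=0.8, seed=None):
    """Split patch indices into (visible, masked) lists at the given ratio.

    mask_ratio=0.8 is an assumed, illustrative value.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)
    num_masked = int(num_patches * mask_ratio)
    masked = sorted(indices[:num_masked])
    visible = sorted(indices[num_masked:])
    return visible, masked

def encoder_inputs(patches, visible):
    """Keep only the visible (non-masked) patches, in their original order."""
    return [patches[i] for i in visible]

# Toy stand-in for spectrogram patches: 10 patches of 4 values each.
patches = [[float(i)] * 4 for i in range(10)]
visible, masked = random_mask_patches(len(patches), mask_ratio=0.8, seed=0)
tokens = encoder_inputs(patches, visible)
```

Only `tokens` would be passed to the encoder in this sketch; the masked indices are kept so that a decoder could later reconstruct those positions.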

Extended U-Net for Speaker Verification in Noisy Environments

wngh1187/exu-net • 27 Jun 2022

Background noise is a well-known factor that deteriorates the accuracy and reliability of speaker verification (SV) systems by blurring speech intelligibility.

PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit

PaddlePaddle/DeepSpeech • NAACL (ACL) 2022

PaddleSpeech is an open-source all-in-one speech toolkit.

20 May 2022

EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification

PolyAI-LDN/evi-paper • Findings (NAACL) 2022

Knowledge-based authentication is crucial for task-oriented spoken dialogue systems that offer personalised and privacy-focused services.

28 Apr 2022

ATST: Audio Representation Learning with Teacher-Student Transformer

Audio-WestlakeU/audiossl • 26 Apr 2022

Self-supervised learning (SSL) learns knowledge from a large amount of unlabeled data, and then transfers the knowledge to a specific problem with a limited number of labeled data.

Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings

mu-y/diarist • 30 Mar 2022

The proposed speaker embedding, named t-vector, is extracted synchronously with the t-SOT ASR model, enabling joint execution of speaker identification (SID) or speaker diarization (SD) with the multi-talker transcription with low latency.

SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech

asappresearch/slue-toolkit • 19 Nov 2021

Historically these have focused on automatic speech recognition (ASR), speaker identification, or other lower-level tasks.
