Speaker Identification
61 papers with code • 4 benchmarks • 4 datasets
Latest papers with no code
VoxCeleb-ESP: preliminary experiments detecting Spanish celebrities from their voices
This paper presents VoxCeleb-ESP, a collection of pointers and timestamps to YouTube videos facilitating the creation of a novel speaker recognition dataset.
Efficiency-oriented approaches for self-supervised speech representation learning
Self-supervised learning enables the training of large neural models without the need for large, labeled datasets.
Privacy-preserving Representation Learning for Speech Understanding
In this paper, we present a novel framework to anonymize utterance-level speech embeddings generated by pre-trained encoders and show its effectiveness for a range of speech classification tasks.
Advanced accent/dialect identification and accentedness assessment with multi-embedding models and automatic speech recognition
In this study, embeddings from advanced pre-trained language identification (LID) and speaker identification (SID) models are leveraged to improve the accuracy of accent classification and non-native accentedness assessment.
End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis
We present an end-to-end multichannel speaker-attributed automatic speech recognition (MC-SA-ASR) system that combines a Conformer-based encoder with multi-frame cross-channel attention and a speaker-attributed Transformer-based decoder.
Test-Time Training for Speech
In this paper, we study the application of Test-Time Training (TTT) as a solution to handling distribution shifts in speech applications.
Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks
Brain-inspired spiking neural networks (SNNs) have demonstrated great potential for temporal signal processing.
Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction
This study provides an empirical analysis of Barlow Twins (BT), an SSL technique inspired by theories of redundancy reduction in human perception.
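Barlow Twins drives the cross-correlation matrix between the embeddings of two augmented views toward the identity, so matched dimensions agree (invariance) while distinct dimensions decorrelate (redundancy reduction). A minimal NumPy sketch of that loss, with an illustrative weight `lam` rather than any value from the study:

```python
import numpy as np

def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Barlow Twins loss: push the cross-correlation matrix of two
    batch-standardized embedding views toward the identity matrix.
    z_a, z_b: (batch, dim) embeddings of two augmented views."""
    n, d = z_a.shape
    # Standardize each embedding dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / z_a.std(axis=0)
    z_b = (z_b - z_b.mean(axis=0)) / z_b.std(axis=0)
    c = z_a.T @ z_b / n  # (dim, dim) cross-correlation matrix
    on_diag = ((np.diag(c) - 1.0) ** 2).sum()            # invariance term
    off_diag = (c ** 2).sum() - (np.diag(c) ** 2).sum()  # redundancy term
    return on_diag + lam * off_diag
```

Feeding the same embeddings as both views makes the diagonal of the cross-correlation exactly 1, so only the (small, λ-weighted) off-diagonal redundancy term remains.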
Read, Look or Listen? What's Needed for Solving a Multimodal Dataset
We propose a two-step method to analyze multimodal datasets, which leverages a small seed of human annotation to map each multimodal instance to the modalities required to process it.
VoxWatch: An open-set speaker recognition benchmark on VoxCeleb
Prior studies of open-set speaker recognition are sparse and lack a common benchmark for systematic evaluation.