Speaker Identification

61 papers with code • 4 benchmarks • 4 datasets

Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input

nttcslab/m2d 26 Oct 2022

We propose a new method, Masked Modeling Duo (M2D), that learns representations directly while obtaining training signals using only masked patches.

39

Cross-Lingual Speaker Identification Using Distant Supervision

slash0bz/speaker-identification 11 Oct 2022

Speaker identification, determining which character said each utterance in literary text, benefits many downstream tasks.

7

IndicSUPERB: A Speech Processing Universal Performance Benchmark for Indian languages

AI4Bharat/indicSUPERB 24 Aug 2022

We hope IndicSUPERB contributes to the progress of developing speech language understanding models for Indian languages.

3

Masked Autoencoders that Listen

facebookresearch/multimodal 13 Jul 2022

Following the Transformer encoder-decoder design in MAE, our Audio-MAE first encodes audio spectrogram patches with a high masking ratio, feeding only the non-masked tokens through encoder layers.

1,302

Extended U-Net for Speaker Verification in Noisy Environments

wngh1187/exu-net 27 Jun 2022

Background noise is a well-known factor that deteriorates the accuracy and reliability of speaker verification (SV) systems by blurring speech intelligibility.

24

PaddleSpeech: An Easy-to-Use All-in-One Speech Toolkit

PaddlePaddle/DeepSpeech NAACL (ACL) 2022

PaddleSpeech is an open-source all-in-one speech toolkit.

10,176
20 May 2022

EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification

PolyAI-LDN/evi-paper Findings (NAACL) 2022

Knowledge-based authentication is crucial for task-oriented spoken dialogue systems that offer personalised and privacy-focused services.

2
28 Apr 2022

ATST: Audio Representation Learning with Teacher-Student Transformer

Audio-WestlakeU/audiossl 26 Apr 2022

Self-supervised learning (SSL) learns knowledge from a large amount of unlabeled data and then transfers that knowledge to a specific problem with a limited amount of labeled data.

65

Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings

mu-y/diarist 30 Mar 2022

The proposed speaker embedding, named t-vector, is extracted synchronously with the t-SOT ASR model, enabling joint execution of speaker identification (SID) or speaker diarization (SD) with the multi-talker transcription with low latency.

15

SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech

asappresearch/slue-toolkit 19 Nov 2021

Historically, shared speech datasets and benchmarks have focused on automatic speech recognition (ASR), speaker identification, or other lower-level tasks.

57