Speaker Recognition

90 papers with code • 1 benchmark • 6 datasets

Speaker Recognition is the process of identifying or confirming the identity of a person given segments of their speech.

Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition
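The two halves of that definition, identifying a speaker and confirming a claimed identity, can be illustrated with a minimal sketch. It assumes each speech segment has already been mapped to a fixed-length embedding vector and that embeddings are compared by cosine similarity, a common choice that this page does not itself prescribe. The function names, the toy three-dimensional embeddings, and the 0.7 threshold are all hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(test_embedding, enrolled_embedding, threshold=0.7):
    """Verification: confirm or reject a claimed identity.

    The threshold is an arbitrary illustrative value; in practice it
    would be tuned on held-out data.
    """
    return cosine_similarity(test_embedding, enrolled_embedding) >= threshold


def identify(test_embedding, enrolled):
    """Identification: return the enrolled speaker whose embedding is closest."""
    return max(
        enrolled,
        key=lambda name: cosine_similarity(test_embedding, enrolled[name]),
    )


# Hand-written toy embeddings (assumption: real ones come from a trained model).
enrolled = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.0, 0.2, 0.9],
}
test = [0.8, 0.2, 0.1]

best_match = identify(test, enrolled)             # closest enrolled speaker
claim_alice = verify(test, enrolled["alice"])     # claimed identity: alice
claim_bob = verify(test, enrolled["bob"])         # claimed identity: bob
```

Identification answers "who is speaking?" by ranking all enrolled speakers, while verification answers "is this the claimed speaker?" with a single thresholded comparison.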

Latest papers with no code

Phonetic-aware speaker embedding for far-field speaker verification

no code yet • 27 Nov 2023

The intuition is that phonetic information can preserve low-level acoustic dynamics with speaker information and thus partly compensate for the degradation due to noise and reverberation.

Parrot-Trained Adversarial Examples: Pushing the Practicality of Black-Box Audio Attacks against Speaker Recognition Models

no code yet • 13 Nov 2023

Motivated by recent advancements in voice conversion (VC), we propose using knowledge of just one short sentence to generate more synthetic speech samples that sound like the target speaker, called parrot speech.

Detecting Agreement in Multi-party Conversational AI

no code yet • 6 Nov 2023

Today, conversational systems are expected to handle conversations in multi-party settings, especially within Socially Assistive Robots (SARs).

Personalizing Keyword Spotting with Speaker Information

no code yet • 6 Nov 2023

Keyword spotting systems often struggle to generalize to a diverse population with various accents and age groups.

Deep Neural Networks for Automatic Speaker Recognition Do Not Learn Supra-Segmental Temporal Features

no code yet • 1 Nov 2023

While deep neural networks have shown impressive results in automatic speaker recognition and related tasks, it is unsatisfactory how little is understood about what exactly is responsible for these results.

UniX-Encoder: A Universal $X$-Channel Speech Encoder for Ad-Hoc Microphone Array Speech Processing

no code yet • 25 Oct 2023

Multi-Task Capability: Beyond the single-task focus of previous systems, UniX-Encoder acts as a robust upstream model, adeptly extracting features for diverse tasks including ASR and speaker recognition.

Privacy-oriented manipulation of speaker representations

no code yet • 10 Oct 2023

Speaker embeddings are ubiquitous, with applications ranging from speaker recognition and diarization to speech synthesis and voice anonymisation.

Tech. Report: Genuinization of Speech waveform PMF for speaker detection spoofing and countermeasures

no code yet • 9 Oct 2023

In this article, we propose an algorithm, denoted genuinization, capable of reducing the waveform distribution gap between authentic speech and spoofing speech.

Disentangling Voice and Content with Self-Supervision for Speaker Recognition

no code yet • NeurIPS 2023

For speaker recognition, it is difficult to extract an accurate speaker representation from speech because of its mixture of speaker traits and content.

Voice Morphing: Two Identities in One Voice

no code yet • 5 Sep 2023

In a biometric system, each biometric sample or template is typically associated with a single identity.