Speaker Recognition
90 papers with code • 1 benchmark • 6 datasets
Speaker Recognition is the process of identifying or confirming the identity of a person from segments of their speech.
Source: Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition
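The two halves of that definition, identifying a speaker and confirming a claimed identity, can be illustrated with a minimal sketch. It assumes each speech segment has already been mapped to a fixed-length embedding by some model; the toy vectors, the function names (`identify`, `verify`), and the 0.7 threshold are all hypothetical and chosen only for illustration. Cosine scoring is one common way to compare embeddings and is not claimed to be the method of the paper cited above.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(test_emb, enrolled):
    """Identification: pick the enrolled speaker with the highest similarity."""
    return max(enrolled, key=lambda name: cosine(test_emb, enrolled[name]))

def verify(test_emb, claimed_emb, threshold=0.7):
    """Verification: accept the claimed identity if similarity passes a threshold.

    The threshold value is an arbitrary assumption; in practice it would be
    tuned on held-out data.
    """
    return cosine(test_emb, claimed_emb) >= threshold

# Toy 3-dimensional embeddings (hypothetical; real embeddings have hundreds of dimensions).
enrolled = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.0, 0.2, 0.95],
}
test_emb = [0.8, 0.2, 0.1]

best_match = identify(test_emb, enrolled)                 # closed-set identification
accepted_as_alice = verify(test_emb, enrolled["alice"])   # claim: "I am alice"
accepted_as_bob = verify(test_emb, enrolled["bob"])       # claim: "I am bob"
```

Identification answers "who is speaking?" among a set of enrolled speakers, while verification answers "is this the person they claim to be?" with a yes/no decision.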
Latest papers with no code
Phonetic-aware speaker embedding for far-field speaker verification
The intuition is that phonetic information can preserve low-level acoustic dynamics with speaker information and thus partly compensate for the degradation due to noise and reverberation.
Parrot-Trained Adversarial Examples: Pushing the Practicality of Black-Box Audio Attacks against Speaker Recognition Models
Motivated by recent advancements in voice conversion (VC), we propose to use the one short sentence knowledge to generate more synthetic speech samples that sound like the target speaker, called parrot speech.
Detecting Agreement in Multi-party Conversational AI
Today, conversational systems are expected to handle conversations in multi-party settings, especially within Socially Assistive Robots (SARs).
Personalizing Keyword Spotting with Speaker Information
Keyword spotting systems often struggle to generalize to a diverse population with various accents and age groups.
Deep Neural Networks for Automatic Speaker Recognition Do Not Learn Supra-Segmental Temporal Features
While deep neural networks have shown impressive results in automatic speaker recognition and related tasks, it is dissatisfactory how little is understood about what exactly is responsible for these results.
UniX-Encoder: A Universal $X$-Channel Speech Encoder for Ad-Hoc Microphone Array Speech Processing
Multi-Task Capability: Beyond the single-task focus of previous systems, UniX-Encoder acts as a robust upstream model, adeptly extracting features for diverse tasks including ASR and speaker recognition.
Privacy-oriented manipulation of speaker representations
Speaker embeddings are ubiquitous, with applications ranging from speaker recognition and diarization to speech synthesis and voice anonymisation.
Thech. Report: Genuinization of Speech waveform PMF for speaker detection spoofing and countermeasures
In this article, we propose an algorithm, denoted genuinization, capable of reducing the waveform distribution gap between authentic speech and spoofing speech.
Disentangling Voice and Content with Self-Supervision for Speaker Recognition
For speaker recognition, it is difficult to extract an accurate speaker representation from speech because of its mixture of speaker traits and content.
Voice Morphing: Two Identities in One Voice
In a biometric system, each biometric sample or template is typically associated with a single identity.