Speaker Identification

15 papers with code · Speech

Greatest papers with code

Speaker Recognition from Raw Waveform with SincNet

29 Jul 2018 mravanelli/SincNet

Rather than employing standard hand-crafted features, CNNs fed raw speech samples learn low-level speech representations from waveforms, potentially allowing the network to better capture important narrow-band speaker characteristics such as pitch and formants.

SPEAKER IDENTIFICATION SPEAKER RECOGNITION SPEAKER VERIFICATION

Deep Speaker: an End-to-End Neural Speaker Embedding System

5 May 2017 philipperemy/deep-speaker

We present Deep Speaker, a neural speaker embedding system that maps utterances to a hypersphere where speaker similarity is measured by cosine similarity.

SPEAKER IDENTIFICATION SPEAKER RECOGNITION
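
The Deep Speaker summary above describes embeddings that lie on a hypersphere and are compared by cosine similarity. The following is a minimal sketch of that comparison only, not the Deep Speaker implementation. The function names and the 4-dimensional embedding values are hypothetical and chosen purely for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so it lies on the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity, computed as the dot product of unit-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    ua, ub = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(ua, ub))


# Hypothetical utterance embeddings (made-up numbers, not model output).
utt_a = [0.9, 0.1, 0.3, 0.2]    # speaker 1, utterance 1
utt_b = [0.8, 0.2, 0.4, 0.1]    # speaker 1, utterance 2
utt_c = [-0.5, 0.9, -0.2, 0.7]  # speaker 2

same_speaker_score = cosine_similarity(utt_a, utt_b)
different_speaker_score = cosine_similarity(utt_a, utt_c)
```

In an identification setting, a test utterance would typically be assigned to the enrolled speaker whose embedding gives the highest score. That decision rule is an assumption here and is not stated in the summary.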

VoxCeleb: a large-scale speaker identification dataset

Interspeech 2018 a-nagrani/VGGVox

Our second contribution is to apply and compare various state-of-the-art speaker identification techniques on our dataset to establish baseline performance.

SPEAKER IDENTIFICATION SPEAKER RECOGNITION SPEAKER VERIFICATION

Generative Pre-Training for Speech with Autoregressive Predictive Coding

23 Oct 2019 iamyuanchung/Autoregressive-Predictive-Coding

Learning meaningful and general representations from unannotated speech that are applicable to a wide range of tasks remains challenging.

REPRESENTATION LEARNING SPEAKER IDENTIFICATION SPEECH RECOGNITION TRANSFER LEARNING

AutoSpeech: Neural Architecture Search for Speaker Recognition

7 May 2020 TAMU-VITA/AutoSpeech

Speaker recognition systems based on Convolutional Neural Networks (CNNs) are often built with off-the-shelf backbones such as VGG-Net or ResNet.

IMAGE CLASSIFICATION NEURAL ARCHITECTURE SEARCH SPEAKER IDENTIFICATION SPEAKER RECOGNITION SPEAKER VERIFICATION

Speech-VGG: A deep feature extractor for speech processing

22 Oct 2019 bepierre/SpeechVGG

While applications of transfer learning are common in the fields of computer vision and natural language processing, audio and speech processing surprisingly lack readily available, transferable models.

LANGUAGE IDENTIFICATION MUSIC CLASSIFICATION REPRESENTATION LEARNING SPEAKER IDENTIFICATION SPEECH ENHANCEMENT TRANSFER LEARNING

Learning Speaker Representations with Mutual Information

1 Dec 2018 Js-Mim/rl_singing_voice

Mutual Information (MI) or similar measures of statistical dependence are promising tools for learning these representations in an unsupervised way.

SPEAKER IDENTIFICATION

AM-MobileNet1D: A Portable Model for Speaker Recognition

31 Mar 2020 joaoantoniocn/AM-MobileNet1D

To address this demand, we propose a portable model called Additive Margin MobileNet1D (AM-MobileNet1D) for Speaker Identification on mobile devices.

SPEAKER IDENTIFICATION SPEAKER RECOGNITION
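
As a rough illustration of the "Additive Margin" in the model name above, the sketch below computes an additive-margin softmax loss for a single example. It assumes the name refers to the commonly used formulation in which a fixed margin is subtracted from the target-class cosine before scaling. The function name, the `scale` and `margin` values, and the example cosines are all hypothetical and are not taken from the AM-MobileNet1D code.

```python
import math


def am_softmax_loss(cosines, target, scale=30.0, margin=0.35):
    """Additive-margin softmax loss for one example (illustrative sketch).

    cosines: cosine similarity between the embedding and each class weight.
    target:  index of the true speaker class.
    scale, margin: assumed hyperparameters, chosen for illustration only.
    """
    # Subtract the margin from the target-class cosine only, then scale all.
    logits = [
        scale * (c - margin) if j == target else scale * c
        for j, c in enumerate(cosines)
    ]
    # Numerically stable log-sum-exp over the logits.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(
        sum(math.exp(z - max_logit) for z in logits)
    )
    # Cross-entropy: negative log-probability of the target class.
    return log_sum_exp - logits[target]


# Hypothetical cosines for three enrolled speakers; class 0 is the true one.
example_cosines = [0.8, 0.1, -0.2]
loss_with_margin = am_softmax_loss(example_cosines, target=0)
loss_without_margin = am_softmax_loss(example_cosines, target=0, margin=0.0)
```

With the margin applied, the same cosines produce a larger loss than plain scaled softmax, which pushes training toward a wider gap between the true speaker and the others.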

On Learning Associations of Faces and Voices

15 May 2018 changil/facevoice

We computationally model the overlapping information between faces and voices and show that the learned cross-modal representation contains enough information to identify matching faces and voices with performance similar to that of humans.

SPEAKER IDENTIFICATION