Speaker verification is the task of verifying the identity of a person from characteristics of their voice.
(Image credit: Contrastive-Predictive-Coding-PyTorch)
Modern automatic speaker verification relies largely on deep neural networks (DNNs) trained on mel-frequency cepstral coefficient (MFCC) features.
Therefore, we propose a self-attentive multi-layer aggregation with feature recalibration and normalization for an end-to-end speaker verification system.
Our primary submission to the challenge is the fusion of seven subsystems, which yields a normalized minimum detection cost function (minDCF) of 0.072 and an equal error rate (EER) of 2.14% on the evaluation set.
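For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how EER and normalized minDCF can be computed from a list of trial scores. It is not taken from any of the listed papers: the function names are hypothetical, and the operating point (`p_target=0.01`, `c_miss=1.0`, `c_fa=1.0`) is an assumed illustrative default, since the excerpt does not state which values the challenge used.

```python
def error_rates(target_scores, nontarget_scores, threshold):
    """Return (miss rate, false-alarm rate) when trials scoring >= threshold are accepted."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return p_miss, p_fa


def eer_and_min_dcf(target_scores, nontarget_scores,
                    p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Sweep thresholds and return (EER, normalized minDCF).

    p_target, c_miss and c_fa are assumed illustrative values,
    not the settings of any particular challenge.
    """
    # Candidate thresholds: every observed score, plus one above the
    # maximum so that the "reject everything" operating point is included.
    all_scores = sorted(set(target_scores) | set(nontarget_scores))
    thresholds = all_scores + [all_scores[-1] + 1.0]

    # Cost of the best trivial system (always accept or always reject),
    # used to normalize the detection cost.
    norm = min(c_miss * p_target, c_fa * (1.0 - p_target))

    best_gap, eer, min_dcf = float("inf"), None, float("inf")
    for t in thresholds:
        p_miss, p_fa = error_rates(target_scores, nontarget_scores, t)

        # EER: the operating point where miss and false-alarm rates are closest.
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            best_gap, eer = gap, (p_miss + p_fa) / 2.0

        # Normalized detection cost at this threshold.
        dcf = (c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)) / norm
        min_dcf = min(min_dcf, dcf)
    return eer, min_dcf


if __name__ == "__main__":
    # Toy scores: higher means "more likely the same speaker".
    targets = [0.9, 0.8, 0.7, 0.4]
    nontargets = [0.6, 0.3, 0.2, 0.1]
    eer, min_dcf = eer_and_min_dcf(targets, nontargets)
    print(f"EER = {eer:.2%}, minDCF = {min_dcf:.3f}")
```

This version sweeps only the observed scores as thresholds and takes the point where the two error rates are closest; evaluation toolkits often interpolate between operating points instead, so values on real data can differ slightly.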
This paper extends recent work on nonlinear Independent Component Analysis (ICA) by introducing a theoretical framework for nonlinear Independent Subspace Analysis (ISA) in the presence of auxiliary variables.
Then, we propose a new method for computing the filter frequency responses by using principal component analysis (PCA).
In this paper, we describe the top-scoring IDLab submission for the text-independent task of the Short-duration Speaker Verification (SdSV) Challenge 2020.
Recent years have seen growing efforts to develop spoofing countermeasures (CMs) to protect automatic speaker verification (ASV) systems from being deceived by manipulated or artificial inputs.
Finally, we use this pipeline to create a large-scale diarisation dataset called VoxConverse, collected from 'in the wild' videos, which we will release publicly to the research community.