Search Results for author: Emmanuel Vincent

Found 32 papers, 12 papers with code

Transformer versus LSTM Language Models trained on Uncertain ASR Hypotheses in Limited Data Scenarios

no code implementations • LREC 2022 • Imran Sheikh, Emmanuel Vincent, Irina Illina

Training of LSTM LMs in such limited data scenarios can benefit from alternate uncertain ASR hypotheses, as observed in our recent work.

Adapting Language Models When Training on Privacy-Transformed Data

no code implementations • LREC 2022 • Tugtekin Turan, Dietrich Klakow, Emmanuel Vincent, Denis Jouvet

In recent years, voice-controlled personal assistants have revolutionized the interaction with smart devices and mobile applications.

The VoicePrivacy 2024 Challenge Evaluation Plan

1 code implementation • 3 Apr 2024 • Natalia Tomashenko, Xiaoxiao Miao, Pierre Champion, Sarina Meyer, Xin Wang, Emmanuel Vincent, Michele Panariello, Nicholas Evans, Junichi Yamagishi, Massimiliano Todisco

The task of the challenge is to develop a voice anonymization system for speech data which conceals the speaker's voice identity while protecting linguistic content and emotional states.

Improving Speaker Assignment in Speaker-Attributed ASR for Real Meeting Applications

no code implementations • 11 Mar 2024 • Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent

Past studies on end-to-end meeting transcription have focused on model architecture and have mostly been evaluated on simulated meeting data.

Action Detection Activity Detection +2

End-to-end Joint Rich and Normalized ASR with a limited amount of rich training data

no code implementations • 29 Nov 2023 • Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent

End-to-end (E2E) ASR models offer both convenience and the ability to perform such joint transcription of speech.

Automatic Speech Recognition (ASR) +2

End-to-end Multichannel Speaker-Attributed ASR: Speaker Guided Decoder and Input Feature Analysis

no code implementations • 16 Oct 2023 • Can Cui, Imran Ahamad Sheikh, Mostafa Sadeghi, Emmanuel Vincent

We present an end-to-end multichannel speaker-attributed automatic speech recognition (MC-SA-ASR) system that combines a Conformer-based encoder with multi-frame crosschannel attention and a speaker-attributed Transformer-based decoder.

Automatic Speech Recognition Speaker Identification +2

Stochastic Pitch Prediction Improves the Diversity and Naturalness of Speech in Glow-TTS

1 code implementation • 28 May 2023 • Sewade Ogun, Vincent Colotte, Emmanuel Vincent

Flow-based generative models are widely used in text-to-speech (TTS) systems to learn the distribution of audio features (e.g., Mel-spectrograms) given the input tokens and to sample from this distribution to generate diverse utterances.

Zero-Shot Multi-Speaker TTS

Can we use Common Voice to train a Multi-Speaker TTS system?

1 code implementation • 12 Oct 2022 • Sewade Ogun, Vincent Colotte, Emmanuel Vincent

We show the viability of this approach for training a multi-speaker GlowTTS model on the Common Voice English dataset.

Automatic Speech Recognition (ASR) +1

The VoicePrivacy 2020 Challenge Evaluation Plan

1 code implementation • 14 May 2022 • Natalia Tomashenko, Brij Mohan Lal Srivastava, Xin Wang, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé, Massimiliano Todisco

The VoicePrivacy Challenge aims to promote the development of privacy preservation tools for speech technology by gathering a new community to define the tasks of interest and the evaluation methodology, and benchmarking solutions through a series of challenges.

Benchmarking

The VoicePrivacy 2022 Challenge Evaluation Plan

1 code implementation • 23 Mar 2022 • Natalia Tomashenko, Xin Wang, Xiaoxiao Miao, Hubert Nourtel, Pierre Champion, Massimiliano Todisco, Emmanuel Vincent, Nicholas Evans, Junichi Yamagishi, Jean-François Bonastre

Participants apply their developed anonymization systems, run evaluation scripts and submit objective evaluation results and anonymized speech data to the organizers.

Speaker Verification

Differentially Private Speaker Anonymization

no code implementations • 23 Feb 2022 • Ali Shahin Shamsabadi, Brij Mohan Lal Srivastava, Aurélien Bellet, Nathalie Vauquier, Emmanuel Vincent, Mohamed Maouche, Marc Tommasi, Nicolas Papernot

We remove speaker information from these attributes by introducing differentially private feature extractors based on an autoencoder and an automatic speech recognizer, respectively, trained using noise layers.

Automatic Speech Recognition (ASR) +2

The VoicePrivacy 2020 Challenge: Results and findings

1 code implementation • 1 Sep 2021 • Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Jose Patino, Brij Mohan Lal Srivastava, Paul-Gauthier Noé, Andreas Nautsch, Nicholas Evans, Junichi Yamagishi, Benjamin O'Brien, Anaïs Chanclu, Jean-François Bonastre, Massimiliano Todisco, Mohamed Maouche

We provide a systematic overview of the challenge design with an analysis of submitted systems and evaluation results.

Blind Room Parameter Estimation Using Multiple-Multichannel Speech Recordings

1 code implementation • 29 Jul 2021 • Prerak Srivastava, Antoine Deleforge, Emmanuel Vincent

Knowing the geometrical and acoustical parameters of a room may benefit applications such as audio augmented reality, speech dereverberation or audio forensics.

Speech Dereverberation

UIAI System for Short-Duration Speaker Verification Challenge 2020

no code implementations • 26 Jul 2020 • Md Sahidullah, Achintya Kumar Sarkar, Ville Vestman, Xuechen Liu, Romain Serizel, Tomi Kinnunen, Zheng-Hua Tan, Emmanuel Vincent

Our primary submission to the challenge is the fusion of seven subsystems which yields a normalized minimum detection cost function (minDCF) of 0.072 and an equal error rate (EER) of 2.14% on the evaluation set.

Text-Dependent Speaker Verification

LibriMix: An Open-Source Dataset for Generalizable Speech Separation

5 code implementations • 22 May 2020 • Joris Cosentino, Manuel Pariente, Samuele Cornell, Antoine Deleforge, Emmanuel Vincent

Most deep learning-based speech separation models today are benchmarked on it.

Audio and Speech Processing

Design Choices for X-vector Based Speaker Anonymization

no code implementations • 18 May 2020 • Brij Mohan Lal Srivastava, Natalia Tomashenko, Xin Wang, Emmanuel Vincent, Junichi Yamagishi, Mohamed Maouche, Aurélien Bellet, Marc Tommasi

The recently proposed x-vector based anonymization scheme converts any input voice into that of a random pseudo-speaker.

Speaker Verification

Foreground-Background Ambient Sound Scene Separation

no code implementations • 11 May 2020 • Michel Olvera, Emmanuel Vincent, Romain Serizel, Gilles Gasso

Ambient sound scenes typically comprise multiple short events occurring on top of a somewhat stationary background.

Introducing the VoicePrivacy Initiative

3 code implementations • 4 May 2020 • Natalia Tomashenko, Brij Mohan Lal Srivastava, Xin Wang, Emmanuel Vincent, Andreas Nautsch, Junichi Yamagishi, Nicholas Evans, Jose Patino, Jean-François Bonastre, Paul-Gauthier Noé, Massimiliano Todisco

The VoicePrivacy initiative aims to promote the development of privacy preservation tools for speech technology by gathering a new community to define the tasks of interest and the evaluation methodology, and benchmarking solutions through a series of challenges.

Benchmarking

CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

no code implementations • 20 Apr 2020 • Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, Neville Ryant

Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges we organize the 6th CHiME Speech Separation and Recognition Challenge (CHiME-6).

Speaker Diarization +4

Limitations of weak labels for embedding and tagging

1 code implementation • 5 Feb 2020 • Nicolas Turpault, Romain Serizel, Emmanuel Vincent

Many datasets and approaches in ambient sound analysis use weakly labeled data. Weak labels are employed because annotating every data sample with a strong label is too expensive. Yet, their impact on the performance in comparison to strong labels remains unclear. Indeed, weak labels must often be dealt with at the same time as other challenges, namely multiple labels per sample, unbalanced classes and/or overlapping events. In this paper, we formulate a supervised learning problem which involves weak labels. We create a dataset that focuses on the difference between strong and weak labels as opposed to other challenges.

Joint NN-Supported Multichannel Reduction of Acoustic Echo, Reverberation and Noise

no code implementations • 20 Nov 2019 • Guillaume Carbajal, Romain Serizel, Emmanuel Vincent, Eric Humbert

We consider the problem of simultaneous reduction of acoustic echo, reverberation and noise.

Privacy-Preserving Adversarial Representation Learning in ASR: Reality or Illusion?

no code implementations • 12 Nov 2019 • Brij Mohan Lal Srivastava, Aurélien Bellet, Marc Tommasi, Emmanuel Vincent

In this paper, we focus on the protection of speaker identity and study the extent to which users can be recognized based on the encoded representation of their speech as obtained by a deep encoder-decoder architecture trained for ASR.

Automatic Speech Recognition (ASR) +4

Evaluating Voice Conversion-based Privacy Protection against Informed Attackers

no code implementations • 10 Nov 2019 • Brij Mohan Lal Srivastava, Nathalie Vauquier, Md Sahidullah, Aurélien Bellet, Marc Tommasi, Emmanuel Vincent

In this paper, we investigate anonymization methods based on voice conversion.

Automatic Speech Recognition (ASR) +4

The Speed Submission to DIHARD II: Contributions & Lessons Learned

no code implementations • 6 Nov 2019 • Md Sahidullah, Jose Patino, Samuele Cornell, Ruiqing Yin, Sunit Sivasankaran, Hervé Bredin, Pavel Korshunov, Alessio Brutti, Romain Serizel, Emmanuel Vincent, Nicholas Evans, Sébastien Marcel, Stefano Squartini, Claude Barras

This paper describes the speaker diarization systems developed for the Second DIHARD Speech Diarization Challenge (DIHARD II) by the Speed team.

Action Detection Activity Detection +4

Filterbank design for end-to-end speech separation

2 code implementations • 23 Oct 2019 • Manuel Pariente, Samuele Cornell, Antoine Deleforge, Emmanuel Vincent

Also, we validate the use of parameterized filterbanks and show that complex-valued representations and masks are beneficial in all conditions.

Speaker Recognition Speech Separation

Lead2Gold: Towards exploiting the full potential of noisy transcriptions for speech recognition

no code implementations • 16 Oct 2019 • Adrien Dufraux, Emmanuel Vincent, Awni Hannun, Armelle Brun, Matthijs Douze

The transcriptions used to train an Automatic Speech Recognition (ASR) system may contain errors.

Automatic Speech Recognition (ASR) +1

SemEval-2019 Task 4: Hyperpartisan News Detection

no code implementations • SEMEVAL 2019 • Johannes Kiesel, Maria Mestre, Rishabh Shukla, Emmanuel Vincent, Payam Adineh, David Corney, Benno Stein, Martin Potthast

Hyperpartisan news is news that takes an extreme left-wing or right-wing standpoint.

Open-Ended Question Answering valid

AI in the media and creative industries

no code implementations • 10 May 2019 • Giuseppe Amato, Malte Behrmann, Frédéric Bimbot, Baptiste Caramiaux, Fabrizio Falchi, Ander Garcia, Joost Geurts, Jaume Gibert, Guillaume Gravier, Hadmut Holken, Hartmut Koenitz, Sylvain Lefebvre, Antoine Liutkus, Fabien Lotte, Andrew Perkis, Rafael Redondo, Enrico Turrin, Thierry Vieville, Emmanuel Vincent

Thanks to the Big Data revolution and increasing computing capacities, Artificial Intelligence (AI) has made an impressive revival over the past few years and is now omnipresent in both research and industry.

A Statistically Principled and Computationally Efficient Approach to Speech Enhancement using Variational Autoencoders

no code implementations • 3 May 2019 • Manuel Pariente, Antoine Deleforge, Emmanuel Vincent

Recent studies have explored the use of deep generative models of speech spectra based on variational autoencoders (VAEs), combined with unsupervised noise models, to perform speech enhancement.

Speech Enhancement Variational Inference

An improved uncertainty propagation method for robust i-vector based speaker recognition

no code implementations • 15 Feb 2019 • Dayana Ribas, Emmanuel Vincent

So far, different uncertainty propagation methods have been proposed to compensate for noise and reverberation in i-vectors in the context of speaker recognition.

Speaker Recognition Speaker Verification +1

The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines

no code implementations • 28 Mar 2018 • Jon Barker, Shinji Watanabe, Emmanuel Vincent, Jan Trmal

The CHiME challenge series aims to advance robust automatic speech recognition (ASR) technology by promoting research at the interface of speech and language processing, signal processing, and machine learning.

Automatic Speech Recognition (ASR) +4

Rank-1 Constrained Multichannel Wiener Filter for Speech Recognition in Noisy Environments

1 code implementation • 1 Jul 2017 • Ziteng Wang, Emmanuel Vincent, Romain Serizel, Yonghong Yan

Multichannel linear filters, such as the Multichannel Wiener Filter (MWF) and the Generalized Eigenvalue (GEV) beamformer are popular signal processing techniques which can improve speech recognition performance.

Speech Recognition
