no code implementations • 9 Feb 2024 • Haocheng Liu, Teysir Baoueb, Mathieu Fontaine, Jonathan Le Roux, Gael Richard
Diffusion models are receiving growing interest for a variety of signal generation tasks such as speech or music synthesis.
1 code implementation • 30 Jan 2024 • Elio Gruttadauria, Mathieu Fontaine, Slim Essid
The results show that our system improves on the state of the art on the AMI headset mix, using no oracle information and under full evaluation (no collar and including overlapped speech).
no code implementations • 30 Jan 2024 • Teysir Baoueb, Haocheng Liu, Mathieu Fontaine, Jonathan Le Roux, Gael Richard
Generative adversarial network (GAN) models can synthesize high-quality audio signals while ensuring fast sample generation.
1 code implementation • NeurIPS 2023 • Victor Letzelter, Mathieu Fontaine, Mickaël Chen, Patrick Pérez, Slim Essid, Gaël Richard
Multiple Choice Learning is a simple framework to tackle multimodal density estimation, using the Winner-Takes-All (WTA) loss for a set of hypotheses.
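The Winner-Takes-All idea is simple: among K hypotheses, only the one closest to the target is penalized, which lets the hypotheses specialize on different modes of the output distribution. A minimal NumPy sketch (the function name and the squared-error choice are illustrative, not the paper's exact formulation):

```python
import numpy as np

def wta_loss(hypotheses, target):
    """Winner-Takes-All loss: only the closest hypothesis is penalized.

    hypotheses: (K, D) array of K candidate predictions
    target:     (D,) ground-truth vector
    Returns the squared error of the winning hypothesis and its index.
    """
    errors = np.sum((hypotheses - target) ** 2, axis=1)  # per-hypothesis squared error
    winner = int(np.argmin(errors))                      # index of the best hypothesis
    return errors[winner], winner

# Toy example: 3 hypotheses for a 2-D target; only hypothesis 1 gets gradient
hyps = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
loss, k = wta_loss(hyps, np.array([1.1, 0.9]))  # winner is hypothesis 1
```

In training, gradients flow only through the winning hypothesis, so each head gradually captures one mode of a multimodal target distribution.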
no code implementations • 8 May 2023 • Diego Di Carlo, Aditya Arie Nugraha, Mathieu Fontaine, Kazuyoshi Yoshii
We address the problem of accurately interpolating measured anechoic steering vectors with a deep learning framework called a neural field.
no code implementations • 22 Jul 2022 • Aditya Arie Nugraha, Kouhei Sekiguchi, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii
Our DNN-free system leverages the posteriors of the latest source spectrograms given by block-online FastMNMF to derive the current source covariance matrices for frame-online beamforming.
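Once per-source covariance matrices are available, a frame-online beamformer can be formed from them in closed form. A minimal sketch using a plain MVDR beamformer as a stand-in (the paper's exact beamforming formulation may differ; `mvdr_weights` is an illustrative name):

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """MVDR beamformer: minimize noise power subject to w^H h = 1.

    noise_cov: (M, M) noise spatial covariance matrix
    steering:  (M,) steering vector toward the target source
    """
    inv = np.linalg.inv(noise_cov)   # inverse noise covariance
    num = inv @ steering
    return num / (steering.conj() @ num)  # normalize for distortionless response

# 2-microphone toy example with identity noise covariance
h = np.array([1.0 + 0j, 1.0 + 0j])
w = mvdr_weights(np.eye(2, dtype=complex), h)
# Distortionless constraint holds: w^H h = 1
```

Because the weights depend only on current covariance estimates, they can be recomputed every frame, which is what makes frame-online operation possible without a DNN in the loop.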
1 code implementation • 15 Jul 2022 • Kouhei Sekiguchi, Aditya Arie Nugraha, Yicheng Du, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii
This paper describes the practical response- and performance-aware development of online speech enhancement for an augmented reality (AR) headset that helps a user understand conversations held in real noisy, echoic environments (e.g., a cocktail party).
no code implementations • 15 Jul 2022 • Yicheng Du, Aditya Arie Nugraha, Kouhei Sekiguchi, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii
This paper describes noisy speech recognition for an augmented reality headset that helps verbal communication within real multiparty conversational environments.
Ranked #1 on Speech Enhancement on EasyCom (SDR metric)
Automatic Speech Recognition (ASR)
no code implementations • 11 May 2022 • Mathieu Fontaine, Kouhei Sekiguchi, Aditya Nugraha, Yoshiaki Bando, Kazuyoshi Yoshii
This paper describes heavy-tailed extensions of a state-of-the-art versatile blind source separation method called fast multichannel nonnegative matrix factorization (FastMNMF) from a unified point of view.
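The motivation for heavy-tailed extensions is robustness: under a Gaussian model, impulsive noise or spectral outliers are extremely improbable and can destabilize estimation, whereas a heavy-tailed distribution such as Student's t assigns them far more mass. A small sketch comparing log-likelihoods (real-valued univariate case for illustration; FastMNMF itself works with complex multichannel spectra):

```python
import math

def gauss_logpdf(x, scale):
    """Log-density of a zero-mean Gaussian with standard deviation `scale`."""
    return -0.5 * math.log(2 * math.pi * scale**2) - x**2 / (2 * scale**2)

def student_t_logpdf(x, scale, nu):
    """Log-density of a zero-mean Student's t with nu degrees of freedom."""
    c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
         - 0.5 * math.log(nu * math.pi * scale**2))
    return c - (nu + 1) / 2 * math.log(1 + (x / scale) ** 2 / nu)

# An outlier 8 standard deviations out is far less surprising
# under the heavy-tailed model than under the Gaussian one.
outlier = 8.0
g = gauss_logpdf(outlier, 1.0)          # very negative
s = student_t_logpdf(outlier, 1.0, 2.0) # much less negative
```

As nu grows, Student's t converges to the Gaussian, so the degrees-of-freedom parameter interpolates between the standard model and its robust heavy-tailed variants.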