Robust Speech Recognition

22 papers with code • 0 benchmarks • 3 datasets

Most implemented papers

Sequential Randomized Smoothing for Adversarially Robust Speech Recognition

raphaelolivier/smoothingasr EMNLP 2021

We apply adaptive versions of state-of-the-art attacks, such as the Imperceptible ASR attack, to our model, and show that our strongest defense is robust to all attacks that use inaudible noise, and can only be broken with very high distortion.

Speech-enhanced and Noise-aware Networks for Robust Speech Recognition

sinica-slam/kaldi-senan 25 Mar 2022

In this paper, a noise-aware training framework based on two cascaded neural structures is proposed to jointly optimize speech enhancement and speech recognition.

Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition

yuchen005/dpsl-asr 28 Mar 2022

Then, we propose style learning to map the fused feature close to clean feature, in order to learn latent speech information from the latter, i.e., clean "speech style".

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding

espnet/espnet 19 Jul 2022

To showcase such integration, we performed experiments on carefully designed synthetic datasets for noisy-reverberant multi-channel ST and SLU tasks, which can be used as benchmark corpora for future research.

DENT-DDSP: Data-efficient noisy speech generator using differentiable digital signal processors for explicit distortion modelling and noise-robust speech recognition

guozixunnicolas/dent-ddsp 1 Aug 2022

Moreover, to validate whether the data simulated by DENT-DDSP are able to replace the scarce in-domain noisy data in the noise-robust ASR tasks, several downstream ASR models with the same architecture are trained using the simulated data and the real data.

CCC-wav2vec 2.0: Clustering aided Cross Contrastive Self-supervised learning of speech representations

speech-lab-iitm/ccc-wav2vec-2.0 5 Oct 2022

While Self-Supervised Learning has helped reap the benefit of the scale from the available unlabeled data, the learning paradigms are continuously being bettered.

Audio-Visual Efficient Conformer for Robust Speech Recognition

burchim/avec 4 Jan 2023

We improve previous lip reading methods using an Efficient Conformer back-end on top of a ResNet-18 visual front-end and by adding intermediate CTC losses between blocks.

Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition

yuchen005/gradient-remedy 22 Feb 2023

In this paper, we propose a simple yet effective approach called gradient remedy (GR) to solve interference between task gradients in noise-robust speech recognition, from perspectives of both angle and magnitude.

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

facebookresearch/muavic 1 Mar 2023

We introduce MuAViC, a multilingual audio-visual corpus for robust speech recognition and robust speech-to-text translation providing 1200 hours of audio-visual speech in 9 languages.

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

zhuole1025/LyricWhiz 29 Jun 2023

We introduce LyricWhiz, a robust, multilingual, and zero-shot automatic lyrics transcription method achieving state-of-the-art performance on various lyrics transcription datasets, even in challenging genres such as rock and metal.