Robust Speech Recognition

22 papers with code • 0 benchmarks • 3 datasets

Latest papers with no code

XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception

no code yet • 21 Mar 2024

It is designed to maximize the benefits of limited multilingual AV pre-training data by building on top of audio-only multilingual pre-training and simplifying existing pre-training schemes.

KinSPEAK: Improving speech recognition for Kinyarwanda via semi-supervised learning methods

no code yet • 23 Aug 2023

In this work, we show that using self-supervised pre-training, following a simple curriculum schedule during fine-tuning and using semi-supervised learning to leverage large unlabelled speech data significantly improve speech recognition performance for Kinyarwanda.
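The semi-supervised step described here typically amounts to self-training: decode unlabelled audio with the current model and keep only confident hypotheses as pseudo-labels for the next fine-tuning round. A minimal sketch of that filtering loop (the `recognize` callable, its `(transcript, confidence)` return signature, and the threshold are illustrative assumptions, not the paper's API):

```python
def select_pseudo_labels(utterances, recognize, threshold=0.9):
    """Keep only hypotheses the current model is confident about,
    to be mixed into the next fine-tuning round as pseudo-labels.

    recognize: callable mapping an utterance to (transcript, confidence);
    this signature is an assumption for the sketch.
    """
    kept = []
    for utt in utterances:
        text, confidence = recognize(utt)
        if confidence >= threshold:
            kept.append((utt, text))
    return kept

# Toy recognizer: "confident" only on longer inputs.
def fake_recognize(utt):
    return ("muraho", 0.95 if len(utt) > 3 else 0.5)

pseudo = select_pseudo_labels(["abcd", "ab", "abcde"], fake_recognize)  # keeps 2
```

In practice the threshold trades pseudo-label quantity against quality, which is why such pipelines often re-decode the unlabelled pool after each round as the model improves.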

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

no code yet • 23 Jun 2023

The CHiME challenges have played a significant role in the development and evaluation of robust automatic speech recognition (ASR) systems.

Statistical Beamformer Exploiting Non-stationarity and Sparsity with Spatially Constrained ICA for Robust Speech Recognition

no code yet • 13 Jun 2023

In this paper, we present a statistical beamforming algorithm as a pre-processing step for robust automatic speech recognition (ASR).
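As a baseline for what such beamforming pre-processing does, a delay-and-sum beamformer time-aligns the microphone channels toward the target speaker and averages them, so coherent speech adds up while off-axis noise partially cancels. A minimal sketch with integer sample delays (the paper's statistical, ICA-constrained method is far more sophisticated; everything below is illustrative):

```python
import numpy as np

def delay_and_sum(mic_signals, delays_samples):
    """Advance each channel by its (integer) delay toward the target
    direction, then average the aligned channels."""
    out = np.zeros(mic_signals.shape[1])
    for sig, d in zip(mic_signals, delays_samples):
        out += np.roll(sig, -d)  # circular shift; edge wrap is fine for a sketch
    return out / len(mic_signals)

# Two mics observing the same source, the second one 3 samples late.
src = np.sin(2 * np.pi * 0.05 * np.arange(200))
mics = np.stack([src, np.roll(src, 3)])
enhanced = delay_and_sum(mics, [0, 3])  # recovers src exactly here
```

Real systems estimate the delays (or full filter weights) from the data rather than assuming them, which is exactly where statistical methods like the one above come in.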

RescueSpeech: A German Corpus for Speech Recognition in Search and Rescue Domain

no code yet • 6 Jun 2023

Despite the recent advancements in speech recognition, there are still difficulties in accurately transcribing conversational and emotional speech in noisy and reverberant acoustic environments.

Incorporating L2 Phonemes Using Articulatory Features for Robust Speech Recognition

no code yet • 5 Jun 2023

Furthermore, fine-tuning on L2 speech improves recognition accuracy for both L1 and L2 speech without performance trade-offs.

AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR

no code yet • CVPR 2023

(ii) We also introduce a simple curriculum scheme during training, which we show is crucial to enable the model to jointly process audio and visual information effectively; and finally (iii) we show that our model achieves state-of-the-art zero-shot results on three different AV-ASR benchmarks (How2, VisSpeech and Ego4D), while also crucially preserving decent performance on traditional audio-only speech recognition benchmarks (LibriSpeech).

pMCT: Patched Multi-Condition Training for Robust Speech Recognition

no code yet • 11 Jul 2022

For analyses on robust ASR, we employed pMCT on the VOiCES dataset which is a noisy reverberant dataset created using utterances from LibriSpeech.
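Multi-condition training in general augments clean training utterances with distorted variants; a patch-wise version can be sketched as choosing, per time patch, either the clean or the distorted signal. This is a hypothetical illustration of the idea, not the authors' exact pMCT algorithm:

```python
import numpy as np

def patched_mix(clean, distorted, patch_len, p_distort=0.5, seed=None):
    """Per fixed-length time patch, take either the clean or the distorted
    version of the same utterance at random (a toy stand-in for
    patch-wise multi-condition training)."""
    rng = np.random.default_rng(seed)
    out = clean.copy()
    for start in range(0, len(clean), patch_len):
        if rng.random() < p_distort:
            out[start:start + patch_len] = distorted[start:start + patch_len]
    return out

# Toy "waveforms": clean is all zeros, distorted all ones, so the output
# shows exactly which patches were swapped in.
clean = np.zeros(1000)
distorted = np.ones(1000)
mixed = patched_mix(clean, distorted, patch_len=100, seed=0)
```

Each training epoch would then see a different clean/distorted patchwork of the same utterance, which is what makes the augmentation effective against reverberant data like VOiCES.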

On monoaural speech enhancement for automatic recognition of real noisy speech using mixture invariant training

no code yet • 3 May 2022

In this paper, we explore an improved framework to train a monoaural neural enhancement model for robust speech recognition.
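Mixture invariant training (MixIT) trains the enhancement model on a mixture of two mixtures: the model's estimated sources are assigned back to the two reference mixtures, and the loss is taken under the best such assignment. A brute-force sketch of that objective (MSE stands in for the paper's loss, which is an assumption, and the exhaustive search is only practical for a handful of sources):

```python
import numpy as np
from itertools import product

def mixit_loss(sources, mix1, mix2):
    """Best-assignment reconstruction loss over all binary
    source-to-mixture assignments."""
    best = np.inf
    for mask in product([False, True], repeat=len(sources)):
        m = np.array(mask)
        est1 = sources[m].sum(axis=0)   # sources assigned to mixture 1
        est2 = sources[~m].sum(axis=0)  # the rest go to mixture 2
        loss = np.mean((mix1 - est1) ** 2) + np.mean((mix2 - est2) ** 2)
        best = min(best, loss)
    return best

# Three estimated "sources"; the mixtures are built so a perfect
# assignment exists, making the best loss zero.
sources = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
mix1 = sources[0] + sources[2]
mix2 = sources[1]
loss = mixit_loss(sources, mix1, mix2)  # 0.0 for the correct assignment
```

Because only mixtures (not clean references) are needed, this objective lets the enhancement model train directly on real noisy recordings.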

End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation

no code yet • 1 Apr 2022

This work presents our end-to-end (E2E) automatic speech recognition (ASR) model targeting robust speech recognition, called Integrated speech Recognition with enhanced speech Input for Self-supervised learning representation (IRIS).