Sequence-To-Sequence Speech Recognition
7 papers with code • 0 benchmarks • 0 datasets
Latest papers with no code
RASR2: The RWTH ASR Toolkit for Generic Sequence-to-sequence Speech Recognition
Modern public ASR toolkits usually provide rich support for training various sequence-to-sequence (S2S) models, but comparatively simple decoding support, often limited to open-vocabulary scenarios.
Avoid Overthinking in Self-Supervised Models for Speech Recognition
Although popular for classification tasks in vision and language, early exit (EE) has seen less use for sequence-to-sequence speech recognition (ASR) tasks, where outputs from early layers are often degenerate.
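The early-exit idea above can be sketched as follows. This is a minimal illustration, not the paper's actual method: it assumes each intermediate layer has its own classifier head, and exits at the first layer whose prediction entropy falls below a threshold (the threshold value and function names are illustrative).

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def early_exit(layer_logits, threshold=0.5):
    """Return (layer_index, probs) for the first intermediate
    classifier whose mean prediction entropy is below `threshold`;
    otherwise fall back to the final layer."""
    for i, logits in enumerate(layer_logits):
        p = softmax(logits)
        entropy = -(p * np.log(p + 1e-12)).sum(axis=-1).mean()
        if entropy < threshold:
            return i, p
    return len(layer_logits) - 1, p

# An uncertain first layer (uniform logits) is skipped; the
# confident second layer triggers the exit.
layers = [np.array([[0.0, 0.0, 0.0]]), np.array([[5.0, 0.0, 0.0]])]
exit_layer, probs = early_exit(layers)
```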
Language-agnostic Code-Switching in Sequence-To-Sequence Speech Recognition
Code-switching (CS) refers to the phenomenon of alternately using words and phrases from different languages.
Integrating Knowledge into End-to-End Speech Recognition from External Text-Only Data
To alleviate these issues, we propose a unified method called LST (Learn Spelling from Teachers) to integrate knowledge from external text-only data into an AED model and to leverage the whole context of a sentence.
Improving sequence-to-sequence speech recognition training with on-the-fly data augmentation
Sequence-to-Sequence (S2S) models recently started to show state-of-the-art performance for automatic speech recognition (ASR).
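One common family of on-the-fly augmentations for S2S ASR masks random bands of a spectrogram during training (in the style of SpecAugment). A minimal sketch, assuming a log-mel spectrogram of shape `(time, freq)`; all parameter names and defaults here are illustrative, not taken from the paper:

```python
import numpy as np

def spec_augment(spec, num_freq_masks=2, freq_width=8,
                 num_time_masks=2, time_width=10, rng=None):
    """Apply SpecAugment-style masking to a (time, freq) spectrogram.
    Randomly chosen frequency bands and time spans are replaced with
    the spectrogram mean; the input array is left untouched."""
    rng = rng or np.random.default_rng()
    out = spec.copy()
    t, f = out.shape
    fill = out.mean()
    for _ in range(num_freq_masks):
        w = int(rng.integers(0, freq_width + 1))
        f0 = int(rng.integers(0, max(1, f - w)))
        out[:, f0:f0 + w] = fill
    for _ in range(num_time_masks):
        w = int(rng.integers(0, time_width + 1))
        t0 = int(rng.integers(0, max(1, t - w)))
        out[t0:t0 + w, :] = fill
    return out

# Masking happens per batch, so each epoch sees different corruptions.
augmented = spec_augment(np.random.default_rng(0).normal(size=(50, 40)),
                         rng=np.random.default_rng(1))
```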
Unsupervised pre-training for sequence to sequence speech recognition
Unsupervised pre-training is carried out on the AISHELL-2 dataset, and we apply the pre-trained model to multiple paired-data ratios of AISHELL-1 and HKUST.
Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition
Integrating an external language model into a sequence-to-sequence speech recognition system is non-trivial.
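A standard baseline for this integration is shallow fusion: log-linearly interpolating the decoder's per-token distribution with an external LM at decode time, score(y) = log P_asr(y) + λ · log P_lm(y). A minimal sketch (the weight and arrays below are illustrative, not from the paper):

```python
import numpy as np

def shallow_fusion_scores(asr_log_probs, lm_log_probs, lm_weight=0.3):
    """Log-linear interpolation (shallow fusion) of the seq2seq
    decoder's token log-probabilities with an external LM's."""
    return asr_log_probs + lm_weight * lm_log_probs

# The acoustic model slightly prefers token 0, but the LM strongly
# prefers token 1; with lm_weight=1.0 the fused score picks token 1.
asr = np.log(np.array([0.5, 0.3, 0.2]))
lm = np.log(np.array([0.1, 0.8, 0.1]))
fused = shallow_fusion_scores(asr, lm, lm_weight=1.0)
next_token = int(np.argmax(fused))
```

In a beam search, this fused score replaces the plain decoder log-probability when ranking hypothesis extensions.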
Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions
Coupled with a convolutional language model, our time-depth separable convolution architecture improves by more than 22% relative WER over the best previously reported sequence-to-sequence results on the noisy LibriSpeech test set.
Sequence-Level Knowledge Distillation for Model Compression of Attention-based Sequence-to-Sequence Speech Recognition
We investigate the feasibility of sequence-level knowledge distillation of Sequence-to-Sequence (Seq2Seq) models for Large Vocabulary Continuous Speech Recognition (LVCSR).
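In its simplest form, sequence-level knowledge distillation trains the student with cross-entropy against the teacher's 1-best hypothesis used as a pseudo-transcript, rather than matching per-frame distributions. A minimal sketch of that loss, assuming the teacher's beam output is already available as token ids (shapes and names are illustrative):

```python
import numpy as np

def sequence_kd_loss(student_log_probs, teacher_best_tokens):
    """Sequence-level KD loss: mean negative log-probability the
    student assigns to the teacher's 1-best hypothesis.
    student_log_probs: (T, V) per-step log-probabilities.
    teacher_best_tokens: length-T token ids from the teacher's beam."""
    steps = np.arange(len(teacher_best_tokens))
    return float(-student_log_probs[steps, teacher_best_tokens].mean())

# Two decoding steps over a 2-token vocabulary.
student = np.log(np.array([[0.5, 0.5],
                           [0.25, 0.75]]))
loss = sequence_kd_loss(student, np.array([0, 1]))
```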
Analysis of Multilingual Sequence-to-Sequence speech recognition systems
This paper investigates the applications of various multilingual approaches developed in conventional hidden Markov model (HMM) systems to sequence-to-sequence (seq2seq) automatic speech recognition (ASR).