Search Results for author: Ruchao Fan

Found 16 papers, 1 paper with code

UniEnc-CASSNAT: An Encoder-only Non-autoregressive ASR for Speech SSL Models

no code implementations • 14 Feb 2024 • Ruchao Fan, Natarajan Balaji Shankar, Abeer Alwan

UniEnc-CASSNAT consists of only an encoder as the major module, which can be the SFM.

Automatic Speech Recognition speech-recognition +1

Towards Better Domain Adaptation for Self-supervised Models: A Case Study of Child ASR

1 code implementation • 28 Apr 2023 • Ruchao Fan, Yunzheng Zhu, Jinhan Wang, Abeer Alwan

With the proposed methods (E-APC and DRAFT), the relative WER improvements are even larger (30% and 19% on the OGI and MyST data, respectively) when compared to the models without using pretraining methods.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition

no code implementations • 15 Apr 2023 • Ruchao Fan, Wei Chu, Peng Chang, Abeer Alwan

During inference, an error-based alignment sampling method is investigated in depth to reduce the alignment mismatch in the training and testing processes.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Acoustic-aware Non-autoregressive Spell Correction with Mask Sample Decoding

no code implementations • 16 Oct 2022 • Ruchao Fan, Guoli Ye, Yashesh Gaur, Jinyu Li

As a result, we reduce the WER of a streaming TT from 7.6% to 6.5% on the Librispeech test-other data and the CER from 7.3% to 6.1% on the Aishell test data, respectively.

Language Modelling speech-recognition +1

CTCBERT: Advancing Hidden-unit BERT with CTC Objectives

no code implementations • 16 Oct 2022 • Ruchao Fan, Yiming Wang, Yashesh Gaur, Jinyu Li

We examine CTCBERT on IDs from HuBERT Iter1, HuBERT Iter2, and PBERT.

DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR

no code implementations • 16 Jun 2022 • Ruchao Fan, Abeer Alwan

However, models trained through SSL are biased to the pretraining data which is usually different from the data used in finetuning tasks, causing a domain shifting problem, and thus resulting in limited knowledge transfer.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Towards Better Meta-Initialization with Task Augmentation for Kindergarten-aged Speech Recognition

no code implementations • 24 Feb 2022 • Yunzheng Zhu, Ruchao Fan, Abeer Alwan

When data are scarce, the model might overfit to the training data, and hence good starting points for training are essential.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

LPC Augment: An LPC-Based ASR Data Augmentation Algorithm for Low and Zero-Resource Children's Dialects

no code implementations • 19 Feb 2022 • Alexander Johnson, Ruchao Fan, Robin Morris, Abeer Alwan

This paper proposes a novel linear prediction coding-based data augmentation method for children's low and zero resource dialect ASR.

Data Augmentation

Low Resource German ASR with Untranscribed Data Spoken by Non-native Children -- INTERSPEECH 2021 Shared Task SPAPL System

no code implementations • 18 Jun 2021 • Jinhan Wang, Yunzheng Zhu, Ruchao Fan, Wei Chu, Abeer Alwan

~5 hours of transcribed data and ~60 hours of untranscribed data are provided to develop a German ASR system for children.

Acoustic Modelling Automatic Speech Recognition +4

An Improved Single Step Non-autoregressive Transformer for Automatic Speech Recognition

no code implementations • 18 Jun 2021 • Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao, Abeer Alwan

For the analyses, we plot attention weight distributions in the decoders to visualize the relationships between token-level acoustic embeddings.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Fundamental Frequency Feature Normalization and Data Augmentation for Child Speech Recognition

no code implementations • 18 Feb 2021 • Gary Yeung, Ruchao Fan, Abeer Alwan

Because of the lack of publicly available young child speech data, feature extraction strategies such as feature normalization and data augmentation must be considered to successfully train child ASR systems.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Bi-APC: Bidirectional Autoregressive Predictive Coding for Unsupervised Pre-training and Its Application to Children's ASR

no code implementations • 12 Feb 2021 • Ruchao Fan, Amber Afshan, Abeer Alwan

We present a bidirectional unsupervised model pre-training (UPT) method and apply it to children's automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

CASS-NAT: CTC Alignment-based Single Step Non-autoregressive Transformer for Speech Recognition

no code implementations • 28 Oct 2020 • Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao

The information is used to extract an acoustic representation for each token in parallel, referred to as a token-level acoustic embedding, which substitutes for the word embedding in the autoregressive transformer (AT) to achieve parallel generation in the decoder.

speech-recognition Speech Recognition

Exploring the Use of an Unsupervised Autoregressive Model as a Shared Encoder for Text-Dependent Speaker Verification

no code implementations • 8 Aug 2020 • Vijay Ravi, Ruchao Fan, Amber Afshan, Huanhua Lu, Abeer Alwan

A fusion of the x-vector/PLDA baseline and the SID/PLDA scores prior to PID fusion further improved performance by 15% indicating complementarity of the proposed approach to the x-vector system.

Text-Dependent Speaker Verification

Improving Generalization of Transformer for Speech Recognition with Parallel Schedule Sampling and Relative Positional Embedding

no code implementations • 1 Nov 2019 • Pan Zhou, Ruchao Fan, Wei Chen, Jia Jia

The Transformer has recently shown promising results in many sequence-to-sequence transformation tasks.

speech-recognition Speech Recognition

An Online Attention-based Model for Speech Recognition

no code implementations • 13 Nov 2018 • Ruchao Fan, Pan Zhou, Wei Chen, Jia Jia, Gang Liu

In previous work, researchers have shown that such architectures can acquire comparable results to state-of-the-art ASR systems, especially when using a bidirectional encoder and global soft attention (GSA) mechanism.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2
