Search Results for author: Ruchao Fan

Found 16 papers, 1 paper with code

Towards Better Domain Adaptation for Self-supervised Models: A Case Study of Child ASR

1 code implementation · 28 Apr 2023 · Ruchao Fan, Yunzheng Zhu, Jinhan Wang, Abeer Alwan

With the proposed methods (E-APC and DRAFT), the relative WER improvements are even larger (30% and 19% on the OGI and MyST data, respectively) when compared to the models without using pretraining methods.

Automatic Speech Recognition (ASR)

A CTC Alignment-based Non-autoregressive Transformer for End-to-end Automatic Speech Recognition

no code implementations · 15 Apr 2023 · Ruchao Fan, Wei Chu, Peng Chang, Abeer Alwan

During inference, an error-based alignment sampling method is investigated in depth to reduce the alignment mismatch in the training and testing processes.

Automatic Speech Recognition (ASR)

Acoustic-aware Non-autoregressive Spell Correction with Mask Sample Decoding

no code implementations · 16 Oct 2022 · Ruchao Fan, Guoli Ye, Yashesh Gaur, Jinyu Li

As a result, we reduce the WER of a streaming TT from 7.6% to 6.5% on the Librispeech test-other data and the CER from 7.3% to 6.1% on the Aishell test data.

Language Modelling, Speech Recognition

CTCBERT: Advancing Hidden-unit BERT with CTC Objectives

no code implementations · 16 Oct 2022 · Ruchao Fan, Yiming Wang, Yashesh Gaur, Jinyu Li

We examine CTCBERT on IDs from HuBERT Iter1, HuBERT Iter2, and PBERT.

DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children's ASR

no code implementations · 16 Jun 2022 · Ruchao Fan, Abeer Alwan

However, models trained through SSL are biased toward the pretraining data, which usually differs from the data used in finetuning tasks. This causes a domain shifting problem and limits knowledge transfer.

Automatic Speech Recognition (ASR)

LPC Augment: An LPC-Based ASR Data Augmentation Algorithm for Low and Zero-Resource Children's Dialects

no code implementations · 19 Feb 2022 · Alexander Johnson, Ruchao Fan, Robin Morris, Abeer Alwan

This paper proposes a novel linear prediction coding-based data augmentation method for children's low and zero resource dialect ASR.

Data Augmentation

An Improved Single Step Non-autoregressive Transformer for Automatic Speech Recognition

no code implementations · 18 Jun 2021 · Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao, Abeer Alwan

For the analyses, we plot attention weight distributions in the decoders to visualize the relationships between token-level acoustic embeddings.

Automatic Speech Recognition (ASR)

Fundamental Frequency Feature Normalization and Data Augmentation for Child Speech Recognition

no code implementations · 18 Feb 2021 · Gary Yeung, Ruchao Fan, Abeer Alwan

Because of the lack of publicly available young child speech data, feature extraction strategies such as feature normalization and data augmentation must be considered to successfully train child ASR systems.

Automatic Speech Recognition (ASR)

CASS-NAT: CTC Alignment-based Single Step Non-autoregressive Transformer for Speech Recognition

no code implementations · 28 Oct 2020 · Ruchao Fan, Wei Chu, Peng Chang, Jing Xiao

The information is used to extract an acoustic representation for each token in parallel, referred to as a token-level acoustic embedding, which substitutes for the word embedding in the autoregressive transformer (AT) to achieve parallel generation in the decoder.

Speech Recognition

Exploring the Use of an Unsupervised Autoregressive Model as a Shared Encoder for Text-Dependent Speaker Verification

no code implementations · 8 Aug 2020 · Vijay Ravi, Ruchao Fan, Amber Afshan, Huanhua Lu, Abeer Alwan

A fusion of the x-vector/PLDA baseline and the SID/PLDA scores prior to PID fusion further improved performance by 15%, indicating that the proposed approach is complementary to the x-vector system.

Text-Dependent Speaker Verification

An Online Attention-based Model for Speech Recognition

no code implementations · 13 Nov 2018 · Ruchao Fan, Pan Zhou, Wei Chen, Jia Jia, Gang Liu

In previous work, researchers have shown that such architectures can achieve results comparable to state-of-the-art ASR systems, especially when using a bidirectional encoder and a global soft attention (GSA) mechanism.

Automatic Speech Recognition (ASR)
