Search Results for author: Sining Sun

Found 13 papers, 2 papers with code

Skipformer: A Skip-and-Recover Strategy for Efficient Speech Recognition

no code implementations • 13 Mar 2024 • Wenjing Zhu, Sining Sun, Changhao Shan, Peng Fan, Qing Yang

Conformer-based attention models have become the de facto backbone model for Automatic Speech Recognition tasks.

Automatic Speech Recognition • speech-recognition • +1

Self-Supervised Disentangled Representation Learning for Robust Target Speech Extraction

no code implementations • 16 Dec 2023 • Zhaoxi Mu, Xinyu Yang, Sining Sun, Qing Yang

However, in the task of target speech extraction, certain elements of global and local semantic information in the reference speech, which are irrelevant to speaker identity, can lead to speaker confusion within the speech extraction network.

Disentanglement • Speech Extraction

Key Frame Mechanism For Efficient Conformer Based End-to-end Speech Recognition

1 code implementation • 23 Oct 2023 • Peng Fan, Changhao Shan, Sining Sun, Qing Yang, Jianwei Zhang

Following the initial encoder, we introduce an intermediate CTC loss function to compute the label frame, enabling us to extract the key frames and blank frames for KFSA.

Automatic Speech Recognition • speech-recognition • +1

DCCRN-KWS: an audio bias based model for noise robust small-footprint keyword spotting

no code implementations • 21 May 2023 • Shubo Lv, Xiong Wang, Sining Sun, Long Ma, Lei Xie

Real-world complex acoustic environments, especially those with a low signal-to-noise ratio (SNR), pose tremendous challenges to a keyword spotting (KWS) system.

Denoising • Multi-Task Learning • +4

Two Stage Contextual Word Filtering for Context bias in Unified Streaming and Non-streaming Transducer

no code implementations • 17 Jan 2023 • Zhanheng Yang, Sining Sun, Xiong Wang, Yike Zhang, Long Ma, Lei Xie

In this paper, we propose an efficient approach to obtain a high-quality contextual list for a unified streaming/non-streaming E2E model.

Leveraging Acoustic Contextual Representation by Audio-textual Cross-modal Learning for Conversational ASR

no code implementations • 3 Jul 2022 • Kun Wei, Yike Zhang, Sining Sun, Lei Xie, Long Ma

Then, during the training of the conversational ASR system, the extractor is frozen and used to extract the textual representation of the preceding speech, and this representation is fed as context to the ASR decoder through an attention mechanism.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +1

Conversational Speech Recognition By Learning Conversation-level Characteristics

no code implementations • 16 Feb 2022 • Kun Wei, Yike Zhang, Sining Sun, Lei Xie, Long Ma

Conversational automatic speech recognition (ASR) is the task of recognizing conversational speech involving multiple speakers.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2

Improving Streaming Transformer Based ASR Under a Framework of Self-supervised Learning

no code implementations • 15 Sep 2021 • Songjun Cao, Yueteng Kang, Yanzhe Fu, Xiaoshuo Xu, Sining Sun, Yike Zhang, Long Ma

Under such a framework, the neural network is usually pre-trained with massive unlabeled data and then fine-tuned with limited labeled data.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2

Multi-head Monotonic Chunkwise Attention For Online Speech Recognition

no code implementations • 1 May 2020 • Baiji Liu, Songjun Cao, Sining Sun, Weibin Zhang, Long Ma

Experiments on AISHELL-1 data show that the proposed model, along with the training strategies, improves the character error rate (CER) of MoChA from 8.96% to 7.68% on the test set.

speech-recognition • Speech Recognition

Training Augmentation with Adversarial Examples for Robust Speech Recognition

no code implementations • 7 Jun 2018 • Sining Sun, Ching-Feng Yeh, Mari Ostendorf, Mei-Yuh Hwang, Lei Xie

This paper explores the use of adversarial examples in training speech recognition systems to increase robustness of deep neural network acoustic models.

Data Augmentation • Robust Speech Recognition • +1

Domain Adversarial Training for Accented Speech Recognition

no code implementations • 7 Jun 2018 • Sining Sun, Ching-Feng Yeh, Mei-Yuh Hwang, Mari Ostendorf, Lei Xie

In this paper, we propose a domain adversarial training (DAT) algorithm to alleviate the accented speech recognition problem.

Accented Speech Recognition • Multi-Task Learning • +1

Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition

1 code implementation • 27 Mar 2018 • Ke Wang, Junbo Zhang, Sining Sun, Yujun Wang, Fei Xiang, Lei Xie

First, we study the effectiveness of different dereverberation networks (the generator in GAN) and find that LSTM leads to a significant improvement compared with feed-forward DNN and CNN on our dataset.

Robust Speech Recognition • Speech Dereverberation • +1

The NNI Query-by-Example System for MediaEval 2015

no code implementations • MediaEval 2015 Workshop 2015 • Jingyong Hou, Van Tung Pham, Cheung-Chi Leung, Lei Wang, HaiHua Xu, Hang Lv, Lei Xie, Zhonghua Fu, Chongjia Ni, Xiong Xiao, Hongjie Chen, Shaofei Zhang, Sining Sun, Yougen Yuan, Pengcheng Li, Tin Lay Nwe, Sunil Sivadas, Bin Ma, Eng Siong Chng, Haizhou Li

This paper describes the system developed by the NNI team for the Query-by-Example Search on Speech Task (QUESST) in the MediaEval 2015 evaluation.

Ranked #9 on Keyword Spotting on QUESST

Keyword Spotting
