Search Results for author: Jian-Hua Tao

Found 23 papers, 2 papers with code

Spike-Triggered Non-Autoregressive Transformer for End-to-End Speech Recognition

no code implementations • 16 May 2020 • Zhengkun Tian, Jiangyan Yi, Jian-Hua Tao, Ye Bai, Shuai Zhang, Zhengqi Wen

To address this problem and improve the inference speed, we propose a spike-triggered non-autoregressive transformer model for end-to-end speech recognition, which introduces a CTC module to predict the length of the target sequence and accelerate the convergence.

Machine Translation speech-recognition +2

Simultaneous Denoising and Dereverberation Using Deep Embedding Features

no code implementations • 6 Apr 2020 • Cunhang Fan, Jian-Hua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen

In this paper, we propose a joint training method for simultaneous speech denoising and dereverberation using deep embedding features, based on deep clustering (DC).

Clustering Deep Clustering +4

Deep Attention Fusion Feature for Speech Separation with End-to-End Post-filter Method

no code implementations • 17 Mar 2020 • Cunhang Fan, Jian-Hua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen, Xuefei Liu

Secondly, to pay more attention to the outputs of the pre-separation stage, an attention module is applied to acquire deep attention fusion features, which are extracted by computing the similarity between the mixture and the pre-separated speech.

Deep Attention Speech Separation

Rnn-transducer with language bias for end-to-end Mandarin-English code-switching speech recognition

no code implementations • 19 Feb 2020 • Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Jian-Hua Tao, Ye Bai

Recently, language identity information has been utilized to improve the performance of end-to-end code-switching (CS) speech recognition.

Language Identification speech-recognition +1

Synchronous Transformers for End-to-End Speech Recognition

no code implementations • 6 Dec 2019 • Zhengkun Tian, Jiangyan Yi, Ye Bai, Jian-Hua Tao, Shuai Zhang, Zhengqi Wen

Once a fixed-length chunk of the input sequence is processed by the encoder, the decoder begins to predict symbols immediately.

speech-recognition Speech Recognition

Integrating Knowledge into End-to-End Speech Recognition from External Text-Only Data

no code implementations • 4 Dec 2019 • Ye Bai, Jiangyan Yi, Jian-Hua Tao, Zhengqi Wen, Zhengkun Tian, Shuai Zhang

To alleviate the above two issues, we propose a unified method called LST (Learn Spelling from Teachers) to integrate knowledge into an AED model from the external text-only data and leverage the whole context in a sentence.

Language Modelling Sentence +2

Conversational Emotion Analysis via Attention Mechanisms

no code implementations • 24 Oct 2019 • Zheng Lian, Jian-Hua Tao, Bin Liu, Jian Huang

Unlike emotion recognition in individual utterances, we propose a multimodal learning framework that uses the relations and dependencies among utterances for conversational emotion analysis.

Emotion Recognition

Domain adversarial learning for emotion recognition

no code implementations • 24 Oct 2019 • Zheng Lian, Jian-Hua Tao, Bin Liu, Jian Huang

The secondary task is to learn a common representation in which speaker identities cannot be distinguished.

Emotion Recognition

Expression Analysis Based on Face Regions in Read-world Conditions

no code implementations • 23 Oct 2019 • Zheng Lian, Ya Li, Jian-Hua Tao, Jian Huang, Ming-Yue Niu

To sum up, the contributions of this paper lie in two areas: 1) We visualize concerned areas of human faces in emotion recognition; 2) We analyze the contribution of different face areas to different emotions in real-world conditions through experimental analysis.

Facial Emotion Recognition Facial Expression Recognition +1

Speech Emotion Recognition via Contrastive Loss under Siamese Networks

no code implementations • 23 Oct 2019 • Zheng Lian, Ya Li, Jian-Hua Tao, Jian Huang

It outperforms the baseline system optimized without the contrastive loss function by 1.14% and 2.55% in weighted accuracy and unweighted accuracy, respectively.

feature selection Speech Emotion Recognition

Self-Attention Transducers for End-to-End Speech Recognition

no code implementations • 28 Sep 2019 • Zhengkun Tian, Jiangyan Yi, Jian-Hua Tao, Ye Bai, Zhengqi Wen

Furthermore, a path-aware regularization is proposed to help the self-attention transducer (SA-T) learn alignments and improve performance.

speech-recognition Speech Recognition

Discriminative Learning for Monaural Speech Separation Using Deep Embedding Features

no code implementations • 23 Jul 2019 • Cunhang Fan, Bin Liu, Jian-Hua Tao, Jiangyan Yi, Zhengqi Wen

Firstly, a deep clustering (DC) network is trained to extract deep embedding features, which contain each source's information and have an advantage in discriminating each target speaker.

Clustering Deep Clustering +1

Forward-Backward Decoding for Regularizing End-to-End TTS

1 code implementation • 18 Jul 2019 • Yibin Zheng, Xi Wang, Lei He, Shifeng Pan, Frank K. Soong, Zhengqi Wen, Jian-Hua Tao

Experimental results show that our proposed methods, especially the second one (bidirectional decoder regularization), lead to a significant improvement in both robustness and overall naturalness, outperforming the baseline (the revised version of Tacotron2) with a MOS gap of 0.14 in a challenging test and achieving close to human quality (4.42 vs. 4.49 in MOS) on a general test.

Reinforcement Learning Based Emotional Editing Constraint Conversation Generation

no code implementations • 17 Apr 2019 • Jia Li, Xiao Sun, Xing Wei, Changliang Li, Jian-Hua Tao

In recent years, the generation of conversation content based on deep neural networks has attracted many researchers.

Multi-Task Learning reinforcement-learning +1

Audio Visual Emotion Recognition with Temporal Alignment and Perception Attention

no code implementations • 28 Mar 2016 • Linlin Chao, Jian-Hua Tao, Minghao Yang, Ya Li, Zhengqi Wen

The other one is locating and re-weighting the perception attentions in the whole audio-visual stream for better recognition.

Classification Emotion Recognition +1
