no code implementations • 16 Dec 2023 • Yayue Deng, Jinlong Xue, Yukang Jia, Qifei Li, Yichen Han, Fengping Wang, Yingming Gao, Dengfeng Ke, Ya Li
In this paper, we introduce a contrastive learning-based CSS framework, CONCSS.
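The contrastive objective underlying such a framework can be illustrated with a minimal InfoNCE-style loss: pull an anchor embedding toward a positive sample and away from negatives. This is a generic sketch for illustration only, not CONCSS's actual training code; the function name and toy vectors are hypothetical.

```python
import math

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss over cosine similarities:
    low when the positive is close to the anchor and the negatives
    are far, high otherwise."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    pos = math.exp(cos(anchor, positive) / temperature)
    neg = sum(math.exp(cos(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Loss is near zero when the positive matches the anchor, large when
# a negative is more similar than the positive.
loss_good = info_nce_loss([1.0, 0.0], [0.9, 0.1], [[-1.0, 0.0], [0.0, 1.0]])
loss_bad = info_nce_loss([1.0, 0.0], [-1.0, 0.0], [[0.9, 0.1], [0.0, 1.0]])
```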
no code implementations • 5 Jun 2023 • Dengfeng Ke, Yayue Deng, Yukang Jia, Jinlong Xue, Qi Luo, Ya Li, Jianqing Sun, Jiaen Liang, Binghuai Lin
Regressive Text-to-Speech (TTS) systems utilize an attention mechanism to generate the alignment between the text and the acoustic feature sequence.
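The text-to-acoustic alignment such an attention mechanism produces can be sketched with plain dot-product attention: score each text-encoder state against the current decoder state, softmax into alignment weights, and form a context vector. This is a generic illustration under assumed toy dimensions, not this paper's model.

```python
import math

def attention_alignment(decoder_state, encoder_states):
    """Dot-product attention: produce a softmax alignment over the
    encoder (text) states and the resulting context vector."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(encoder_states[0])
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# The decoder state most similar to an encoder state gets the
# largest alignment weight.
weights, context = attention_alignment(
    [1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
```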
1 code implementation • 15 Jun 2022 • Linkai Peng, Yingming Gao, Binghuai Lin, Dengfeng Ke, Yanlu Xie, Jinsong Zhang
In the field of assessing the pronunciation quality of constrained speech, the given transcriptions can play the role of a teacher.
no code implementations • 6 Aug 2021 • Dengfeng Ke, Yuxing Lu, Xudong Liu, Yanyan Xu, Jing Sun, Cheng-Hao Cai
With the rapid development of neural network architectures and speech processing models, singing voice synthesis with neural networks is becoming the cutting-edge technique of digital music production.
no code implementations • 6 May 2021 • Dengfeng Ke, Jinsong Zhang, Yanlu Xie, Yanyan Xu, Binghuai Lin
With all these modifications, the size of the PHASEN model is shrunk from 33M parameters to 5M parameters, while the performance on VoiceBank+DEMAND is improved to a CSIG score of 4.30, a PESQ score of 3.07 and a COVL score of 3.73.
1 code implementation • 17 Apr 2021 • Kaiqi Fu, Jones Lin, Dengfeng Ke, Yanlu Xie, Jinsong Zhang, Binghuai Lin
Recently, end-to-end mispronunciation detection and diagnosis (MD&D) systems have become a popular alternative that greatly simplifies the model-building process of conventional hybrid DNN-HMM systems by representing complicated modules with a single deep network architecture.
1 code implementation • 25 Jun 2020 • Yongqiang Dou, Haocheng Yang, Maolin Yang, Yanyan Xu, Dengfeng Ke
Besides, in the experiments, we select three kinds of features that contain both magnitude-based and phase-based information to form complementary and informative features.
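Magnitude- and phase-based spectral information of the kind these complementary features draw on can both be read off a DFT of the signal. A minimal pure-Python sketch for illustration only, not the paper's actual feature pipeline:

```python
import cmath
import math

def dft_mag_phase(signal):
    """Naive DFT; return per-bin magnitude and phase, the two kinds
    of spectral information magnitude- and phase-based features use."""
    n = len(signal)
    mags, phases = [], []
    for k in range(n):
        s = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
        mags.append(abs(s))
        phases.append(cmath.phase(s))
    return mags, phases

# A constant signal puts all its energy in bin 0.
mags, phases = dft_mag_phase([1.0, 1.0, 1.0, 1.0])
```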
3 code implementations • 17 Apr 2019 • Feiyang Chen, Ziqian Luo, Yanyan Xu, Dengfeng Ke
Therefore, in this paper, based on audio and text, we consider the task of multimodal sentiment analysis and propose a novel fusion strategy including both multi-feature fusion and multi-modality fusion to improve the accuracy of audio-text sentiment analysis.
Multimodal Emotion Recognition · Multimodal Sentiment Analysis
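The two-level fusion idea above (combine several features within each modality, then combine across modalities) can be sketched as simple concatenation. This is a hedged illustration of the general strategy, not the paper's actual fusion network; the function names are hypothetical.

```python
def fuse(audio_feats, text_feats):
    """Two-level fusion by concatenation: multi-feature fusion within
    each modality, then multi-modality fusion across them."""
    def concat(vectors):
        out = []
        for v in vectors:
            out.extend(v)
        return out

    audio = concat(audio_feats)  # multi-feature fusion (audio side)
    text = concat(text_feats)    # multi-feature fusion (text side)
    return audio + text          # multi-modality fusion

# Two audio feature vectors and two text feature vectors fuse into
# one joint representation.
fused = fuse([[1.0, 2.0], [3.0]], [[4.0], [5.0, 6.0]])
```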
no code implementations • 2 May 2018 • Bin Liu, Shuai Nie, Yaping Zhang, Dengfeng Ke, Shan Liang, Wenju Liu
In realistic environments, speech is usually interfered by various noise and reverberation, which dramatically degrades the performance of automatic speech recognition (ASR) systems.
Automatic Speech Recognition (ASR) +2
1 code implementation • 28 Oct 2017 • Cheng-Hao Cai, Yanyan Xu, Dengfeng Ke, Kaile Su, Jing Sun
In experiments, it is demonstrated that the revised rules can be used to train a range of functional connections: 20 different functions are applied to neural networks with up to 10 hidden layers, and most of them gain high test accuracies on the MNIST database.
no code implementations • 25 Apr 2017 • Cheng-Hao Cai, Dengfeng Ke, Yanyan Xu, Kaile Su
Briefly, in a reasoning system, a deep feedforward neural network is used to guide rewriting processes after learning from algebraic reasoning examples produced by humans.