no code implementations • CCL 2022 • Xiaoli Feng, Yingming Gao, Binghuai Lin, Jinsong Zhang
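"This paper introduces entropy to quantify the distribution of learners' second-language (L2) phoneme pronunciation errors. An analysis of phoneme error rate and error dispersion across different phonemes and across learners of different L2 proficiency levels shows that: 1. error rate and error dispersion are highly correlated, and the differences between the two reflect differences in how errors are distributed; 2. among phonemes with similar error rates, those more similar to a native-language phoneme show lower error dispersion; 3. compared with beginner-level learners, intermediate-level learners show a lower phoneme error rate but a higher error dispersion. Entropy can therefore go beyond error rate to further reveal how the learner's native phonological system and L2 proficiency affect the dispersion of phoneme pronunciation errors."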
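The difference between error rate and entropy-based error dispersion can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following are assumptions: dispersion is taken to be the Shannon entropy (in bits) of the distribution of erroneous realizations only, and the function name `error_rate_and_entropy` and the toy phoneme labels are hypothetical.

```python
import math
from collections import Counter


def error_rate_and_entropy(target, realizations):
    """Return (error rate, entropy in bits of the error distribution).

    Assumption: dispersion is the Shannon entropy over the erroneous
    realizations only; the paper's exact definition may differ.
    """
    if not realizations:
        raise ValueError("no realizations given")
    # Every realization that differs from the target counts as an error.
    errors = [r for r in realizations if r != target]
    rate = len(errors) / len(realizations)
    if not errors:
        return rate, 0.0
    total = len(errors)
    # H = sum_i p_i * log2(1 / p_i) over the observed error types.
    entropy = sum(
        (c / total) * math.log2(total / c) for c in Counter(errors).values()
    )
    return rate, entropy


# Two toy phonemes with the same error rate but different dispersion.
concentrated = error_rate_and_entropy("a", ["a", "a", "b", "b"])
dispersed = error_rate_and_entropy("a", ["a", "a", "b", "c"])
```

In this toy data both phonemes are mispronounced half of the time, but the second one is replaced by two different sounds rather than one, so its entropy is higher. This is the kind of contrast that error rate alone does not capture.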
1 code implementation • 2 Jan 2024 • Jinlong Xue, Yayue Deng, Yingming Gao, Ya Li
Drawing inspiration from state-of-the-art Text-to-Image (T2I) diffusion models, we introduce Auffusion, a text-to-audio (TTA) system that adapts T2I model frameworks to the TTA task by effectively leveraging their inherent generative strengths and precise cross-modal alignment.
Ranked #5 on Audio Generation on AudioCaps
1 code implementation • 27 Dec 2023 • Qifei Li, Yingming Gao, Cong Wang, Yayue Deng, Jinlong Xue, Yichen Han, Ya Li
To address this problem, we propose a frame-level emotional state alignment method for speech emotion recognition (SER).
no code implementations • 16 Dec 2023 • Yayue Deng, Jinlong Xue, Yukang Jia, Qifei Li, Yichen Han, Fengping Wang, Yingming Gao, Dengfeng Ke, Ya Li
In this paper, we introduce CONCSS, a contrastive learning-based framework for conversational speech synthesis (CSS).
1 code implementation • 28 Aug 2023 • Linkai Peng, Baorian Nuchged, Yingming Gao
We focus on evaluating the efficacy of LLMs in education, specifically in areas of spoken language learning, which encompass phonetics, phonology, and second language acquisition.
no code implementations • 3 May 2023 • Jinlong Xue, Yayue Deng, Fengping Wang, Ya Li, Yingming Gao, JianHua Tao, Jianqing Sun, Jiaen Liang
However, it is still a challenge to comprehensively model the conversation, and a majority of conversational TTS systems only focus on extracting global information and omit local prosody features, which contain important fine-grained information like keywords and emphasis.
no code implementations • 7 Oct 2022 • Yichen Han, Ya Li, Yingming Gao, Jinlong Xue, Songpo Wang, Lei Yang
Then we used keypoint decomposition to extract video synthesis controlling parameters from the backend output and the source image.
1 code implementation • 15 Jun 2022 • Linkai Peng, Yingming Gao, Binghuai Lin, Dengfeng Ke, Yanlu Xie, Jinsong Zhang
In the field of assessing the pronunciation quality of constrained speech, the given transcriptions can play the role of a teacher.