1 code implementation • 6 Mar 2024 • Chao-Wei Huang, Chen-An Li, Tsu-Yuan Hsu, Chen-Yu Hsu, Yun-Nung Chen
Dense retrieval methods have demonstrated promising performance in multilingual information retrieval, where queries and documents can be in different languages.
no code implementations • 25 Sep 2023 • Chun-Yi Kuan, Chen-An Li, Tsu-Yuan Hsu, Tse-Yang Lin, Ho-Lam Chung, Kai-Wei Chang, Shuo-Yiin Chang, Hung-Yi Lee
This paper introduces a novel voice conversion (VC) model, guided by text instructions such as "articulate slowly with a deep tone" or "speak in a cheerful boyish voice".
1 code implementation • 13 Sep 2023 • Chao-Wei Huang, Chen-Yu Hsu, Tsu-Yuan Hsu, Chen-An Li, Yun-Nung Chen
Conversational search provides a natural interface for information retrieval (IR).
no code implementations • 12 May 2023 • Yu-Kuan Fu, Liang-Hsuan Tseng, Jiatong Shi, Chen-An Li, Tsu-Yuan Hsu, Shinji Watanabe, Hung-Yi Lee
We use fully unpaired data to train our unsupervised systems and evaluate our results on CoVoST 2 and CVSS.
no code implementations • 24 Feb 2023 • Kuan-Po Huang, Tzu-hsun Feng, Yu-Kuan Fu, Tsu-Yuan Hsu, Po-Chieh Yen, Wei-Cheng Tseng, Kai-Wei Chang, Hung-Yi Lee
We applied two different aggregation techniques, layerwise-average and layerwise-concatenation, to the representations of different teacher models and found that the former was more effective.
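The two aggregation techniques above can be sketched as follows. This is a minimal illustration with NumPy, assuming each teacher exposes a list of per-layer feature matrices of shape (frames, dim); the toy data and function names are hypothetical, not from the paper's code.

```python
import numpy as np

# Hypothetical per-layer representations from two teacher models:
# 3 layers each, every layer a (frames, dim) feature matrix.
teacher_a = [np.full((4, 8), float(i)) for i in range(3)]
teacher_b = [np.full((4, 8), float(i + 10)) for i in range(3)]

def layerwise_average(teachers):
    """Average corresponding layers across teachers; feature dim is unchanged."""
    return [np.mean(np.stack(layers), axis=0) for layers in zip(*teachers)]

def layerwise_concatenation(teachers):
    """Concatenate corresponding layers along the feature dim; dim grows with the number of teachers."""
    return [np.concatenate(layers, axis=-1) for layers in zip(*teachers)]

avg = layerwise_average([teacher_a, teacher_b])
cat = layerwise_concatenation([teacher_a, teacher_b])
print(avg[0].shape)  # (4, 8)  -- same dim as a single teacher
print(cat[0].shape)  # (4, 16) -- dims summed across teachers
```

Note the trade-off: averaging keeps the student's target dimensionality fixed regardless of how many teachers are combined, while concatenation grows linearly with the teacher count.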
no code implementations • 29 Nov 2022 • Tsu-Yuan Hsu, Chen-An Li, Tung-Yu Wu, Hung-Yi Lee
In the first stage, SSL is conducted on a large-scale unlabeled corpus to pre-train a small speech model.
1 code implementation • 14 Oct 2022 • Kuan-Po Huang, Yu-Kuan Fu, Tsu-Yuan Hsu, Fabian Ritter Gutierrez, Fan-Lin Wang, Liang-Hsuan Tseng, Yu Zhang, Hung-Yi Lee
Self-supervised learned (SSL) speech pre-trained models perform well across various speech processing tasks.
1 code implementation • 26 Sep 2022 • Tung-Yu Wu, Chen-An Li, Tzu-Han Lin, Tsu-Yuan Hsu, Hung-Yi Lee
Extensive experiments on speech and non-speech audio datasets are conducted to investigate the representation abilities of our ensemble method and its individual constituent models.