no code implementations • ROCLING 2022 • Qiu-Xia Zhang, Te-Yu Chi, Te-Lun Yang, Jyh-Shing Roger Jang
This study uses training and validation data from the “ROCLING 2022 Chinese Health Care Named Entity Recognition Task” for modeling.
1 code implementation • 20 Feb 2024 • Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee
Speech emotion recognition (SER) is a pivotal technology for human-computer interaction systems.
no code implementations • 27 Nov 2023 • Yu-Chen Lin, Akhilesh Kumar, Norman Chang, Wenliang Zhang, Muhammad Zakir, Rucha Apte, Haiyang He, Chao Wang, Jyh-Shing Roger Jang
We present four main contributions to enhance the performance of Large Language Models (LLMs) in generating domain-specific code: (i) utilizing LLM-based data splitting and data renovation techniques to improve the semantic representation of the embedding space; (ii) introducing the Chain of Density for Renovation Credibility (CoDRC), driven by LLMs, and the Adaptive Text Renovation (ATR) algorithm for assessing data renovation reliability; (iii) developing the Implicit Knowledge Expansion and Contemplation (IKEC) Prompt technique; and (iv) effectively refactoring existing scripts to generate new and high-quality scripts with LLMs.
1 code implementation • 21 Nov 2023 • Jun-You Wang, Chon-In Leong, Yu-Chen Lin, Li Su, Jyh-Shing Roger Jang
With the use of data augmentation and a source separation model, the proposed method achieves a character error rate below 18% on a Mandarin polyphonic dataset for lyrics transcription and a mean absolute error of 0.071 seconds for lyrics alignment.
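For readers unfamiliar with the two metrics named in this entry, the following is a minimal sketch of how a character error rate and a mean absolute error are commonly computed. It is not the authors' evaluation code; the function names and the choice of plain Levenshtein distance over characters are assumptions made for illustration.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two character sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref, hyp):
    """Edit distance divided by the reference length (hypothetical helper)."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


def mean_absolute_error(ref_times, pred_times):
    """Average absolute difference between reference and predicted times, in seconds."""
    if not ref_times or len(ref_times) != len(pred_times):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - p) for r, p in zip(ref_times, pred_times)) / len(ref_times)
```

Under these definitions, one substituted character in a four-character reference gives a character error rate of 0.25.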
1 code implementation • 28 Jul 2023 • Te-Yu Chi, Yu-Meng Tang, Chia-Wen Lu, Qiu-Xia Zhang, Jyh-Shing Roger Jang
To achieve this objective, we propose a novel self-training strategy that uses labels rather than text for training, significantly reducing the model's training time.
no code implementations • 16 Feb 2023 • Chung-Che Wang, Yu-Chun Lin, Yu-Teng Hsu, Jyh-Shing Roger Jang
A siamese network is used to compare the inputs and predict the preference.
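As a rough illustration of the idea in this entry, the sketch below shows one simple way a siamese setup can turn two inputs into a preference: the same encoder weights score both inputs, and the score difference is passed through a sigmoid. The linear encoder, the feature vectors, and the function names are all assumptions for illustration; the listing does not describe the paper's actual architecture.

```python
import math


def encode(features, weights):
    """Shared encoder: the same weights are applied to either input."""
    return sum(f * w for f, w in zip(features, weights))


def preference_probability(a, b, weights):
    """Probability that input a is preferred over input b (hypothetical helper)."""
    diff = encode(a, weights) - encode(b, weights)
    return 1.0 / (1.0 + math.exp(-diff))
```

Because both branches share weights, swapping the inputs yields the complementary probability, and two identical inputs yield 0.5.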
2 code implementations • 27 Oct 2022 • Xuanjun Chen, Haibin Wu, Chung-Che Wang, Hung-Yi Lee, Jyh-Shing Roger Jang
This paper proposes MTDVocaLiST, a model trained with our proposed multimodal Transformer distillation (MTD) loss.
no code implementations • 3 Oct 2022 • Xuanjun Chen, Haibin Wu, Helen Meng, Hung-Yi Lee, Jyh-Shing Roger Jang
Audio-visual active speaker detection (AVASD) is well-developed and is now an indispensable front-end for several multi-modal applications.
Adversarial Robustness • Audio-Visual Active Speaker Detection
no code implementations • 31 Mar 2022 • Yen-Lun Liao, Xuanjun Chen, Chung-Che Wang, Jyh-Shing Roger Jang
The countermeasure (CM) model is developed to protect Automatic Speaker Verification (ASV) systems from spoofing attacks and to prevent the resulting leakage of personal information.
1 code implementation • 4 Dec 2018 • Szu-Yu Chou, Kai-Hsiang Cheng, Jyh-Shing Roger Jang, Yi-Hsuan Yang
In this paper, we introduce a novel attentional similarity module for the problem of few-shot sound recognition.
Sound • Audio and Speech Processing
no code implementations • 31 Oct 2017 • Zhe-Cheng Fan, Yen-Lin Lai, Jyh-Shing Roger Jang
Separating two sources from an audio mixture is an important task with many applications.
no code implementations • ROCLING/IJCLCLP 2012 • Wei-jay Huang, Jhih-rou Lin, Ren-Yuan Lyu, Yuang-chin Chiang, Jyh-Shing Roger Jang, Ming-Tat Ko