Search Results for author: Jyh-Shing Roger Jang

Found 18 papers, 5 papers with code

CrowNER at Rocling 2022 Shared Task: NER using MacBERT and Adversarial Training

no code implementations • ROCLING 2022 • Qiu-Xia Zhang, Te-Yu Chi, Te-Lun Yang, Jyh-Shing Roger Jang

This study uses training and validation data from the “ROCLING 2022 Chinese Health Care Named Entity Recognition Task” for modeling.

Data Augmentation named-entity-recognition +2

EMO-SUPERB: An In-depth Look at Speech Emotion Recognition

1 code implementation • 20 Feb 2024 • Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee

Speech emotion recognition (SER) is a pivotal technology for human-computer interaction systems.

Self-Supervised Learning Speech Emotion Recognition

Novel Preprocessing Technique for Data Embedding in Engineering Code Generation Using Large Language Model

no code implementations • 27 Nov 2023 • Yu-Chen Lin, Akhilesh Kumar, Norman Chang, Wenliang Zhang, Muhammad Zakir, Rucha Apte, Haiyang He, Chao Wang, Jyh-Shing Roger Jang

We present four main contributions to enhance the performance of Large Language Models (LLMs) in generating domain-specific code: (i) utilizing LLM-based data splitting and data renovation techniques to improve the semantic representation of the embedding space; (ii) introducing the Chain of Density for Renovation Credibility (CoDRC), driven by LLMs, and the Adaptive Text Renovation (ATR) algorithm for assessing data renovation reliability; (iii) developing the Implicit Knowledge Expansion and Contemplation (IKEC) Prompt technique; and (iv) effectively refactoring existing scripts to generate new, high-quality scripts with LLMs.

Code Generation Language Modelling +2

Adapting pretrained speech model for Mandarin lyrics transcription and alignment

1 code implementation • 21 Nov 2023 • Jun-You Wang, Chon-In Leong, Yu-Chen Lin, Li Su, Jyh-Shing Roger Jang

With data augmentation and a source separation model, the proposed method achieves a character error rate of less than 18% on a Mandarin polyphonic dataset for lyrics transcription, and a mean absolute error of 0.071 seconds for lyrics alignment.

Automatic Lyrics Transcription Data Augmentation

WC-SBERT: Zero-Shot Text Classification via SBERT with Self-Training for Wikipedia Categories

1 code implementation • 28 Jul 2023 • Te-Yu Chi, Yu-Meng Tang, Chia-Wen Lu, Qiu-Xia Zhang, Jyh-Shing Roger Jang

To achieve this objective, we propose a novel self-training strategy that uses labels rather than text for training, significantly reducing the model's training time.

Text Classification +1

Personalized Audio Quality Preference Prediction

no code implementations • 16 Feb 2023 • Chung-Che Wang, Yu-Chun Lin, Yu-Teng Hsu, Jyh-Shing Roger Jang

A Siamese network is used to compare the inputs and predict the preference.

Decoder

Multimodal Transformer Distillation for Audio-Visual Synchronization

2 code implementations • 27 Oct 2022 • Xuanjun Chen, Haibin Wu, Chung-Che Wang, Hung-Yi Lee, Jyh-Shing Roger Jang

This paper proposes MTDVocaLiST, a model trained with our proposed multimodal Transformer distillation (MTD) loss.

Audio-Visual Synchronization

Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection

no code implementations • 3 Oct 2022 • Xuanjun Chen, Haibin Wu, Helen Meng, Hung-Yi Lee, Jyh-Shing Roger Jang

Audio-visual active speaker detection (AVASD) is well developed and is now an indispensable front-end for several multi-modal applications.

Adversarial Robustness Audio-Visual Active Speaker Detection

Adversarial Speaker Distillation for Countermeasure Model on Automatic Speaker Verification

no code implementations • 31 Mar 2022 • Yen-Lun Liao, Xuanjun Chen, Chung-Che Wang, Jyh-Shing Roger Jang

The countermeasure (CM) model is developed to protect Automatic Speaker Verification (ASV) systems from spoofing attacks and to prevent the resulting leakage of personal information.

Knowledge Distillation Speaker Verification