Search Results for author: Cheng-I Lai

Found 13 papers, 9 papers with code

Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences

1 code implementation • 15 Mar 2023 • Yuan Tseng, Cheng-I Lai, Hung-Yi Lee

The goal is to determine the spoken sentences' hierarchical syntactic structure in the form of constituency parse trees, such that each node is a span of audio that corresponds to a constituent.

Automatic Speech Recognition (ASR) +3
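For readers unfamiliar with the output format described above, here is a minimal sketch of a constituency parse tree whose nodes are audio spans rather than word spans. The class name, labels, and times are hypothetical illustrations, not taken from the paper's released code.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AudioConstituent:
    """One node of a constituency parse tree over speech:
    a syntactic label attached to a time span of the audio."""
    label: str          # e.g. "NP", "VP" (illustrative label set)
    start: float        # span start time in seconds
    end: float          # span end time in seconds
    children: List["AudioConstituent"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children

# A toy tree for a short utterance spanning 0.0-1.2 s.
tree = AudioConstituent("S", 0.0, 1.2, [
    AudioConstituent("NP", 0.0, 0.5),
    AudioConstituent("VP", 0.5, 1.2),
])
```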

ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers

1 code implementation • 20 Apr 2022 • Kaizhi Qian, Yang Zhang, Heting Gao, Junrui Ni, Cheng-I Lai, David Cox, Mark Hasegawa-Johnson, Shiyu Chang

Self-supervised learning in speech involves training a speech representation network on a large-scale unannotated speech corpus, and then applying the learned representations to downstream tasks.

Disentanglement • Self-Supervised Learning
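As a rough illustration of the pretrain-then-transfer recipe described above, the sketch below reuses a frozen stand-in encoder and attaches a small downstream head. The encoder class and its dimensions are placeholder assumptions, not the actual ContentVec model or its API.

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained self-supervised speech encoder
# (placeholder only, NOT the actual ContentVec model or its API).
class PretrainedSpeechEncoder(nn.Module):
    def __init__(self, feat_dim: int = 768):
        super().__init__()
        self.feat_dim = feat_dim
        self.proj = nn.Linear(80, feat_dim)  # pretend: log-mel frames -> features

    @torch.no_grad()  # frozen after pretraining; no gradients flow into it
    def forward(self, fbank: torch.Tensor) -> torch.Tensor:
        # fbank: (batch, frames, 80) -> representations: (batch, frames, feat_dim)
        return self.proj(fbank)

encoder = PretrainedSpeechEncoder().eval()    # learned on unannotated speech
classifier = nn.Linear(encoder.feat_dim, 10)  # lightweight downstream head

fbank = torch.randn(4, 200, 80)               # dummy batch of log-mel features
reps = encoder(fbank)                         # (4, 200, 768), frozen features
logits = classifier(reps.mean(dim=1))         # mean-pool over time, then classify
```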

Towards Semi-Supervised Semantics Understanding from Speech

no code implementations • 11 Nov 2020 • Cheng-I Lai, Jin Cao, Sravan Bodapati, Shang-Wen Li

Much recent work on Spoken Language Understanding (SLU) falls short in at least one of three ways: models were trained on oracle text input and neglected Automatic Speech Recognition (ASR) outputs, models were trained to predict only intents without the slot values, or models were trained on a large amount of in-house data.

Speech Recognition +1

Semi-Supervised Spoken Language Understanding via Self-Supervised Speech and Language Model Pretraining

1 code implementation • 26 Oct 2020 • Cheng-I Lai, Yung-Sung Chuang, Hung-Yi Lee, Shang-Wen Li, James Glass

Much recent work on Spoken Language Understanding (SLU) is limited in at least one of three ways: models were trained on oracle text input and neglected ASR errors, models were trained to predict only intents without the slot values, or models were trained on a large amount of in-house data.

Language Modelling • Spoken Language Understanding

Can Speaker Augmentation Improve Multi-Speaker End-to-End TTS?

1 code implementation • 4 May 2020 • Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Junichi Yamagishi

This is followed by an analysis of synthesis quality, speaker and dialect similarity, and remarks on the effectiveness of our speaker augmentation approach.

Speech Synthesis

Zero-Shot Multi-Speaker Text-To-Speech with State-of-the-art Neural Speaker Embeddings

3 code implementations • 23 Oct 2019 • Erica Cooper, Cheng-I Lai, Yusuke Yasuda, Fuming Fang, Xin Wang, Nanxin Chen, Junichi Yamagishi

While speaker adaptation for end-to-end speech synthesis using speaker embeddings can produce good speaker similarity for speakers seen during training, there remains a gap for zero-shot adaptation to unseen speakers.

Audio and Speech Processing

Contrastive Predictive Coding Based Feature for Automatic Speaker Verification

1 code implementation • 1 Apr 2019 • Cheng-I Lai

This thesis describes our ongoing work on Contrastive Predictive Coding (CPC) features for speaker verification.

Representation Learning • Speaker Verification

ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual neTworks

1 code implementation • 1 Apr 2019 • Cheng-I Lai, Nanxin Chen, Jesús Villalba, Najim Dehak

We present JHU's system submission to the ASVspoof 2019 Challenge: Anti-Spoofing with Squeeze-Excitation and Residual neTworks (ASSERT).

Feature Engineering • Voice Conversion

Attentive Filtering Networks for Audio Replay Attack Detection

1 code implementation • 31 Oct 2018 • Cheng-I Lai, Alberto Abad, Korin Richmond, Junichi Yamagishi, Najim Dehak, Simon King

In this work, we propose our replay attack detection system, the Attentive Filtering Network, which is composed of an attention-based filtering mechanism that enhances feature representations in both the frequency and time domains, and a ResNet-based classifier.

Speaker Verification
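As a rough illustration of the attention-based-filtering-plus-classifier idea described above, the sketch below multiplies a spectrogram-like input by a learned sigmoid mask and feeds the result to a small convolutional classifier. The layer sizes and mask formulation are assumptions, and the stand-in classifier is not the ResNet model used in the paper.

```python
import torch
import torch.nn as nn

class AttentiveFilter(nn.Module):
    """Predicts a (0, 1) mask over the time-frequency map and reweights
    the input with it (a generic sketch, not the paper's exact design)."""
    def __init__(self, channels: int = 1):
        super().__init__()
        self.attention = nn.Sequential(
            nn.Conv2d(channels, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(8, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),             # per-bin weights in (0, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, freq, time) spectrogram-like features
        return x * self.attention(x)  # element-wise filtering over freq and time

class SpoofDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.filter = AttentiveFilter()
        # Stand-in convolutional classifier; the paper uses a ResNet-based model.
        self.classifier = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, 2),         # genuine vs. replayed
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.filter(x))

scores = SpoofDetector()(torch.randn(2, 1, 257, 400))  # (2, 2) logits
```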
