no code implementations • 11 Apr 2022 • Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang J. Kuo, Brian Kingsbury
Recent advances in End-to-End (E2E) Spoken Language Understanding (SLU) have been primarily due to effective pretraining of speech representations.
no code implementations • 11 Apr 2022 • Vishal Sunder, Samuel Thomas, Hong-Kwang J. Kuo, Jatin Ganhotra, Brian Kingsbury, Eric Fosler-Lussier
In the absence of gold transcripts to fine-tune an ASR model, our model outperforms this baseline by a significant margin of 10% absolute F1 score.
no code implementations • 26 Feb 2022 • Samuel Thomas, Hong-Kwang J. Kuo, Brian Kingsbury, George Saon
In this paper, we propose a novel text representation and training methodology that allows E2E SLU systems to be effectively constructed using these text resources.
no code implementations • 26 Feb 2022 • Samuel Thomas, Brian Kingsbury, George Saon, Hong-Kwang J. Kuo
We observe 20-45% relative word error rate (WER) reduction in these settings with this novel LM style customization technique using only unpaired text data from the new domains.
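The "relative" WER reduction quoted above is measured against a baseline system. A minimal sketch of that computation, with invented WER values for illustration (not numbers from the paper):

```python
# Relative WER reduction: the fraction of the baseline system's WER that the
# customized system removes. The 0.10 / 0.07 values below are made up.

def relative_wer_reduction(baseline_wer, adapted_wer):
    """Fractional reduction of WER relative to the baseline system."""
    return (baseline_wer - adapted_wer) / baseline_wer

print(f"{relative_wer_reduction(0.10, 0.07):.0%}")  # prints 30%
```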
Automatic Speech Recognition (ASR) +1
no code implementations • 28 Jan 2022 • Hong-Kwang J. Kuo, Zoltan Tuske, Samuel Thomas, Brian Kingsbury, George Saon
The goal of spoken language understanding (SLU) systems is to determine the meaning of the input speech signal, unlike speech recognition which aims to produce verbatim transcripts.
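The contrast above can be made concrete with target representations: ASR predicts a verbatim transcript, while an SLU system predicts the meaning, often as an intent plus slots. The utterance, intent label, and slot schema below are invented for this sketch, not taken from the papers:

```python
# ASR target: a verbatim transcript of the input speech.
asr_target = "i would like to book a flight from boston to denver tomorrow"

# SLU target: the meaning of the same utterance as an intent plus slots.
# The intent/slot names here are hypothetical examples.
slu_target = {
    "intent": "book_flight",
    "slots": {"from_city": "boston", "to_city": "denver", "date": "tomorrow"},
}

def format_slu(parse):
    """Render an intent/slot parse as a flat string, one common E2E SLU target format."""
    slots = " ".join(f"{k}={v}" for k, v in sorted(parse["slots"].items()))
    return f"{parse['intent']} {slots}"

print(format_slu(slu_target))
# prints: book_flight date=tomorrow from_city=boston to_city=denver
```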
no code implementations • 18 Aug 2021 • Jatin Ganhotra, Samuel Thomas, Hong-Kwang J. Kuo, Sachindra Joshi, George Saon, Zoltán Tüske, Brian Kingsbury
End-to-end spoken language understanding (SLU) systems that process human-human or human-computer interactions are often context independent and process each turn of a conversation independently.
1 code implementation • 8 Apr 2021 • Samuel Thomas, Hong-Kwang J. Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory
We present a comprehensive study on building and adapting RNN transducer (RNN-T) models for spoken language understanding (SLU).
Automatic Speech Recognition (ASR) +2
no code implementations • 16 Nov 2020 • Edmilson Morais, Hong-Kwang J. Kuo, Samuel Thomas, Zoltan Tuske, Brian Kingsbury
Transformer networks and self-supervised pre-training have consistently delivered state-of-the-art results in the field of natural language processing (NLP); however, their merits in the field of spoken language understanding (SLU) still need further investigation.
no code implementations • 30 Sep 2020 • Hong-Kwang J. Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis Lastras
For our speech-to-entities experiments on the ATIS corpus, both the CTC and attention models showed impressive ability to skip non-entity words: there was little degradation when trained on just entities versus full transcripts.
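Training on "just entities" implies building targets that drop the non-entity words of each transcript. A minimal sketch of that target construction, with an invented ATIS-style transcript and entity spans (the annotation format here is an assumption, not the paper's):

```python
# Build an entities-only training target by keeping only transcript words
# that fall inside labeled entity spans. Example data is invented.

def entity_only_target(words, entity_spans):
    """Return the transcript restricted to words inside entity spans.

    entity_spans: list of (start, end) word-index pairs, end exclusive.
    """
    keep = set()
    for start, end in entity_spans:
        keep.update(range(start, end))
    return " ".join(w for i, w in enumerate(words) if i in keep)

words = "show me flights from boston to denver on monday".split()
spans = [(4, 5), (6, 7), (8, 9)]  # boston, denver, monday
print(entity_only_target(words, spans))  # prints: boston denver monday
```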
no code implementations • 27 Apr 2016 • George Saon, Tom Sercu, Steven Rennie, Hong-Kwang J. Kuo
We describe a collection of acoustic and language modeling techniques that lowered the word error rate of our English conversational telephone LVCSR system to a record 6.6% on the Switchboard subset of the Hub5 2000 evaluation test set.
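Word error rate, the metric cited above, is the word-level edit distance between reference and hypothesis, divided by the reference length. A minimal dynamic-programming sketch (not the official Hub5 scoring pipeline):

```python
# WER = (substitutions + deletions + insertions) / reference word count,
# computed here with a standard Levenshtein DP over word sequences.

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deletion: 1/6
```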
Ranked #5 on Speech Recognition on swb_hub_500 (WER, fullSWBCH)
no code implementations • 21 May 2015 • George Saon, Hong-Kwang J. Kuo, Steven Rennie, Michael Picheny
We describe the latest improvements to the IBM English conversational telephone speech recognition system.
Ranked #11 on Speech Recognition on Switchboard + Hub500