Search Results for author: Gakuto Kurata

Found 11 papers, 1 papers with code

Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems

no code implementations • 7 Sep 2023 • Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Masayasu Muraoka, George Saon

However, existing works only transfer a single representation of LLM (e. g. the last layer of pretrained BERT), while the representation of a text is inherently non-unique and can be obtained variously from different layers, contexts and models.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems

no code implementations • 1 Apr 2022 • Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Nobuyasu Itoh, George Saon

Large-scale language models (LLMs) such as GPT-2, BERT and RoBERTa have been successfully applied to ASR N-best rescoring.

Language Modelling Lexical Analysis

Paper
Add Code

Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing

no code implementations • 29 Mar 2022 • Xiaodong Cui, George Saon, Tohru Nagano, Masayuki Suzuki, Takashi Fukuda, Brian Kingsbury, Gakuto Kurata

We introduce two techniques, length perturbation and n-best based label smoothing, to improve generalization of deep neural network (DNN) acoustic models for automatic speech recognition (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

RNN Transducer Models For Spoken Language Understanding

1 code implementation • 8 Apr 2021 • Samuel Thomas, Hong-Kwang J. Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory

We present a comprehensive study on building and adapting RNN transducer (RNN-T) models for spoken language understanding(SLU).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Code

End-to-End Spoken Language Understanding Without Full Transcripts

no code implementations • 30 Sep 2020 • Hong-Kwang J. Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis Lastras

For our speech-to-entities experiments on the ATIS corpus, both the CTC and attention models showed impressive ability to skip non-entity words: there was little degradation when trained on just entities versus full transcripts.

slot-filling Slot Filling +3

Paper
Add Code

English Broadcast News Speech Recognition by Humans and Machines

no code implementations • 30 Apr 2019 • Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltan Tuske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko

With recent advances in deep learning, considerable attention has been given to achieving automatic speech recognition performance close to human performance on tasks like conversational telephone speech (CTS) recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation

no code implementations • 17 Apr 2019 • Gakuto Kurata, Kartik Audhkhasi

Conventional automatic speech recognition (ASR) systems trained from frame-level alignments can easily leverage posterior fusion to improve ASR accuracy and build a better single model with knowledge distillation.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4