1 code implementation • 8 Sep 2022 • Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
Connectionist temporal classification (CTC)-based models are attractive in automatic speech recognition (ASR) because of their non-autoregressive nature.
Automatic Speech Recognition (ASR) +3
no code implementations • 5 Sep 2022 • Hayato Futami, Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
In this study, we propose to distill the knowledge of BERT for CTC-based ASR, extending our previous study for attention-based ASR.
Automatic Speech Recognition (ASR) +2
no code implementations • 5 Oct 2021 • Hayato Futami, Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
We propose an ASR rescoring method that directly detects errors with ELECTRA, which was originally proposed as a pre-training method for NLP tasks.
Automatic Speech Recognition (ASR) +2
1 code implementation • 9 Aug 2020 • Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
Experimental evaluations show that our method significantly improves ASR performance over the seq2seq baseline on the Corpus of Spontaneous Japanese (CSJ).
Automatic Speech Recognition (ASR) +3
1 code implementation • 19 May 2020 • Kohei Matsuura, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
We evaluated this speaker adaptation approach on two low-resource corpora, namely, Ainu and Mboshi.
Automatic Speech Recognition (ASR) +3
no code implementations • LREC 2020 • Kohei Matsuura, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
Ainu is an unwritten language spoken by the Ainu people, one of the ethnic groups in Japan.
Automatic Speech Recognition (ASR) +2
no code implementations • 22 Sep 2019 • Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara
Moreover, the A2C model can be used to recover out-of-vocabulary (OOV) words that are not covered by the A2W model, but this requires accurate detection of OOV words.
Automatic Speech Recognition (ASR) +1