no code implementations • LREC 2022 • Tomoki Kitagawa, Chee Siang Leow, Hiromitsu Nishizaki
This paper introduces a Y-Autoencoder (Y-AE)-based handwritten character generator that produces multiple Japanese Hiragana characters from a single image, increasing the amount of data available for training a handwritten character recognizer.
Optical Character Recognition (OCR)
no code implementations • 29 Mar 2022 • Akihiro Dobashi, Chee Siang Leow, Hiromitsu Nishizaki
Furthermore, visualization of the attention weights based on the proposed method suggested that it is possible to transform acoustic features considering the frequency characteristics of each language.
Automatic Speech Recognition (ASR)
no code implementations • 7 Oct 2021 • Hayato Endo, Hiromitsu Nishizaki
This paper describes how a semi-supervised learning method called peer collaborative learning (PCL) can be applied to the polyphonic sound event detection (PSED) task, one of the tasks in the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge.
1 code implementation • 3 Apr 2021 • Yu Wang, Chee Siang Leow, Akio Kobayashi, Takehito Utsuro, Hiromitsu Nishizaki
This paper describes ExKaldi-RT, an online automatic speech recognition (ASR) toolkit built on the Kaldi ASR toolkit and the Python language.
Automatic Speech Recognition (ASR)
no code implementations • LREC 2020 • Jiajun Xu, Kyosuke Masuda, Hiromitsu Nishizaki, Fumiyo Fukumoto, Yoshimi Suzuki
Therefore, this paper proposes a method for semi-automatically constructing an emotion corpus.
no code implementations • LREC 2020 • Huaijin Deng, Youchao Lin, Takehito Utsuro, Akio Kobayashi, Hiromitsu Nishizaki, Junichi Hoshino
Experimental evaluation of these integrated features indicates that time-sequential acoustic features improve a model based on disfluency-based and prosodic features when detecting fluent speech, but not when detecting disfluent speech.
no code implementations • LREC 2020 • Meiko Fukuda, Hiromitsu Nishizaki, Yurie Iribe, Ryota Nishimura, Norihide Kitaoka
In an aging society like Japan, a highly accurate speech recognition system is needed for use in electronic devices for the elderly, but this level of accuracy cannot be obtained using conventional speech recognition systems due to the unique features of the speech of elderly people.
no code implementations • 8 Apr 2019 • Masaki Okawa, Takuya Saito, Naoki Sawada, Hiromitsu Nishizaki
In our experiment, we compare the proposed bit-representation waveform, which is fed directly to a neural network, with other representations of audio waveforms, such as a raw audio waveform and a power spectrum, on two classification tasks: acoustic event classification and sound/music classification.
no code implementations • LREC 2012 • Tomoyosi Akiba, Hiromitsu Nishizaki, Kiyoaki Aikawa, Tatsuya Kawahara, Tomoko Matsui
We describe the evaluation framework for spoken document retrieval used in the IR for Spoken Documents Task, conducted at the ninth NTCIR Workshop.