Search Results for author: Masato Mimura

Found 15 papers, 6 papers with code

Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM

1 code implementation • 8 Sep 2022 • Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

Connectionist temporal classification (CTC)-based models are attractive in automatic speech recognition (ASR) because of their non-autoregressive nature.

Automatic Speech Recognition (ASR)

Distilling the Knowledge of BERT for CTC-based ASR

no code implementations • 5 Sep 2022 • Hayato Futami, Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

In this study, we propose to distill the knowledge of BERT for CTC-based ASR, extending our previous study for attention-based ASR.

Automatic Speech Recognition (ASR)

ASR Rescoring and Confidence Estimation with ELECTRA

no code implementations • 5 Oct 2021 • Hayato Futami, Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

We propose an ASR rescoring method for directly detecting errors with ELECTRA, which is originally a pre-training method for NLP tasks.

Automatic Speech Recognition (ASR)

Commuting symplectomorphisms on a surface and the flux homomorphism

no code implementations • 24 Feb 2021 • Morimichi Kawasaki, Mitsuaki Kimura, Takahiro Matsushita, Masato Mimura

Let $(S,\omega)$ be a closed connected oriented surface whose genus $l$ is at least two equipped with a symplectic form.

Symplectic Geometry, Group Theory, Geometric Topology; MSC: Primary 20F12, 20J05, 37E35, 53D35, 70H15; Secondary 20F36, 37A15, 37J05, 37J10, 57R17, 53D22

Constellations in prime elements of number fields

no code implementations • 31 Dec 2020 • Wataru Kai, Masato Mimura, Akihiro Munemasa, Shin-ichiro Seki, Kiyoto Yoshino

Given any number field, we prove that there exist arbitrarily shaped constellations consisting of pairwise non-associate prime elements of the ring of integers.

Number Theory, Combinatorics; MSC: 11B30 (Primary); 11B25, 11H55, 11N05, 11R04, 05C55 (Secondary)

End-to-end Music-mixed Speech Recognition

1 code implementation • 27 Aug 2020 • Jeongwoo Woo, Masato Mimura, Kazuyoshi Yoshii, Tatsuya Kawahara

The time-domain separation method outperformed a frequency-domain separation method, which reuses the phase information of the input mixture signal, both in simple cascading and joint training settings.

Audio and Speech Processing

Distilling the Knowledge of BERT for Sequence-to-Sequence ASR

1 code implementation • 9 Aug 2020 • Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

Experimental evaluations show that our method significantly improves the ASR performance from the seq2seq baseline on the Corpus of Spontaneous Japanese (CSJ).

Automatic Speech Recognition (ASR)

Bavard's duality theorem for mixed commutator length

no code implementations • 5 Jul 2020 • Morimichi Kawasaki, Mitsuaki Kimura, Takahiro Matsushita, Masato Mimura

The goal in this paper is to establish Bavard's duality theorem of $G$-invariant quasimorphisms, which was previously proved by Kawasaki and Kimura in the case $N = [G, N]$.

Group Theory, Algebraic Topology, Geometric Topology

Enhancing Monotonic Multihead Attention for Streaming ASR

1 code implementation • 19 May 2020 • Hirofumi Inaguma, Masato Mimura, Tatsuya Kawahara

For streaming inference, all monotonic attention (MA) heads should learn proper alignments because the next token is not generated until all heads detect the corresponding token boundaries.

Automatic Speech Recognition (ASR)

CTC-synchronous Training for Monotonic Attention Model

1 code implementation • 10 May 2020 • Hirofumi Inaguma, Masato Mimura, Tatsuya Kawahara

Monotonic chunkwise attention (MoChA) has been studied for online streaming automatic speech recognition (ASR) based on a sequence-to-sequence framework.

Automatic Speech Recognition (ASR)

Improving OOV Detection and Resolution with External Language Models in Acoustic-to-Word ASR

no code implementations • 22 Sep 2019 • Hirofumi Inaguma, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara

Moreover, the A2C model can be used to recover out-of-vocabulary (OOV) words that are not covered by the A2W model, but this requires accurate detection of OOV words.

Automatic Speech Recognition (ASR)

Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization

no code implementations • 31 Oct 2017 • Yoshiaki Bando, Masato Mimura, Katsutoshi Itoyama, Kazuyoshi Yoshii, Tatsuya Kawahara

This paper presents a statistical method of single-channel speech enhancement that uses a variational autoencoder (VAE) as a prior distribution on clean speech.

Speech Enhancement
