no code implementations • 20 Jun 2023 • Woojay Jeon
In this work, we describe a novel method of training an embedding-matching word-level connectionist temporal classification (CTC) automatic speech recognizer (ASR) so that, in addition to the transcription, it directly produces the word start times and durations that many real-world applications require.
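To make the timing output concrete, here is a minimal illustrative sketch (not the paper's actual model): given frame-wise word labels from a word-level recognizer (with 0 as a hypothetical blank/silence index and an assumed 30 ms frame shift), contiguous runs of the same word label can be collapsed into (word, start time, duration) tuples.

```python
FRAME_SHIFT = 0.03  # seconds per frame; a hypothetical value for illustration

def words_with_timings(frame_labels):
    """Collapse frame-wise word labels (0 = blank) into (word, start, duration) tuples."""
    out, prev = [], 0
    for t, lab in enumerate(frame_labels):
        if lab != 0 and lab != prev:
            # a new word begins at this frame
            out.append([lab, t * FRAME_SHIFT, FRAME_SHIFT])
        elif lab != 0 and lab == prev:
            # the current word continues; extend its duration
            out[-1][2] += FRAME_SHIFT
        prev = lab
    return [tuple(x) for x in out]

# e.g. word 5 spans frames 2-4, word 7 spans frames 6-7
timings = words_with_timings([0, 0, 5, 5, 5, 0, 7, 7])
```

This is only a post-hoc label-collapsing illustration; the abstract's point is that the recognizer itself emits the timing information.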
no code implementations • 30 Oct 2022 • Hao Yen, Woojay Jeon
In embedding-matching acoustic-to-word (A2W) ASR, every word in the vocabulary is represented by a fixed-dimension embedding vector that can be added or removed independently of the rest of the system.
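The key property described above, that words can be added or removed independently, can be sketched with a hypothetical toy example (the dimensions, vocabulary, and cosine-similarity scoring below are illustrative assumptions, not the paper's specification): each vocabulary word is one row of an embedding matrix, and recognition scores come from matching an acoustic embedding against every row.

```python
import numpy as np

def a2w_scores(acoustic_emb, word_embs):
    """Score each vocabulary word by cosine similarity to the acoustic embedding."""
    a = acoustic_emb / np.linalg.norm(acoustic_emb)
    w = word_embs / np.linalg.norm(word_embs, axis=1, keepdims=True)
    return w @ a

# Hypothetical 4-word vocabulary with 8-dimensional embeddings
rng = np.random.default_rng(0)
vocab = {"yes": 0, "no": 1, "maybe": 2, "stop": 3}
word_embs = rng.standard_normal((4, 8))

# Adding a word only appends a row; nothing else in the system changes
word_embs = np.vstack([word_embs, rng.standard_normal(8)])
vocab["go"] = 4
```

Removing a word is likewise just deleting its row, which is what makes the vocabulary independently editable.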
no code implementations • 20 Jul 2020 • Woojay Jeon
Two encoder neural networks are trained: an acoustic encoder that accepts speech in the form of frame-wise subword posterior probabilities from an acoustic model, and a text encoder that accepts text in the form of subword transcriptions.
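A minimal sketch of this dual-encoder shape, under assumed sizes and with trivial mean-pool-plus-projection encoders standing in for the trained networks (all names and dimensions here are hypothetical): both encoders map variable-length subword sequences into the same fixed-dimension space, where matching speech/text pairs can be compared.

```python
import numpy as np

rng = np.random.default_rng(0)
N_SUBWORDS, EMB_DIM = 100, 32  # hypothetical subword inventory and embedding size

# Acoustic encoder: mean-pool frame-wise subword posteriors, then project
W_acoustic = rng.standard_normal((N_SUBWORDS, EMB_DIM)) * 0.1

def acoustic_encoder(posteriors):
    """posteriors: (num_frames, N_SUBWORDS) array of subword posterior probabilities."""
    return posteriors.mean(axis=0) @ W_acoustic

# Text encoder: mean-pool one-hot subword ids, then project
W_text = rng.standard_normal((N_SUBWORDS, EMB_DIM)) * 0.1

def text_encoder(subword_ids):
    """subword_ids: 1-D array of integer subword indices from a transcription."""
    one_hot = np.eye(N_SUBWORDS)[subword_ids]
    return one_hot.mean(axis=0) @ W_text
```

Training (not shown) would pull the embeddings of a matching speech/text pair together; the sketch only shows that both encoders land in a shared `EMB_DIM`-dimensional space.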
no code implementations • 29 Feb 2020 • Woojay Jeon, Leo Liu, Henry Mason
We propose a method to reduce false voice triggers of a speech-enabled personal assistant by post-processing the hypothesis lattice of a server-side large-vocabulary continuous speech recognizer (LVCSR) via a neural network.
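As a rough, hypothetical sketch of lattice post-processing for false-trigger mitigation (the feature set, pooling, and tiny classifier below are illustrative assumptions, not the paper's architecture): features from the lattice's arcs are pooled into a fixed-size vector and scored by a classifier that estimates whether the audio was a false trigger.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical lattice: each arc carries (acoustic score, LM score, duration)
lattice_arcs = rng.standard_normal((12, 3))

# Pool arc features into a fixed-size vector, then score with a tiny linear classifier
W = rng.standard_normal((6, 1)) * 0.1

def false_trigger_score(arcs):
    """Return a probability-like score that the utterance was a false trigger."""
    feats = np.concatenate([arcs.mean(axis=0), arcs.max(axis=0)])
    return 1.0 / (1.0 + np.exp(-(feats @ W)[0]))  # sigmoid of the linear score

p = false_trigger_score(lattice_arcs)
```

In the actual system the classifier is a trained neural network operating on the LVCSR hypothesis lattice; this sketch only illustrates the pool-then-classify flow.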
no code implementations • 22 Jul 2019 • Woojay Jeon, Maxwell Jordan, Mahesh Krishnamoorthy
We present a new method for computing ASR word confidences that effectively mitigates the effect of ASR errors for diverse downstream applications, improves the word error rate of the 1-best result, and allows better comparison of scores across different models.
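One common way to picture word confidences, shown here only as a generic illustration and not as the paper's method: in a confusion-network-style view of the recognizer output, each slot holds competing word hypotheses with scores, and a word's confidence is its normalized share of the slot's total score.

```python
def word_confidences(slots):
    """slots: list of {word: unnormalized score} dicts, one per output position.

    Returns the best word in each slot with its normalized confidence."""
    out = []
    for slot in slots:
        total = sum(slot.values())
        best = max(slot, key=slot.get)
        out.append((best, slot[best] / total))
    return out

# Hypothetical two-slot output: "play" vs "pray", then "music" vs "muse"
confs = word_confidences([{"play": 8.0, "pray": 2.0}, {"music": 9.5, "muse": 0.5}])
```

A well-calibrated confidence of this kind is what lets downstream applications discount likely ASR errors and compare scores across models.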