1 code implementation • 24 Feb 2024 • Soyoung Yoon, Eunbi Choi, Jiyeon Kim, Yireun Kim, Hyeongu Yun, Seung-won Hwang
We propose ListT5, a novel reranking approach based on Fusion-in-Decoder (FiD) that handles multiple candidate passages at both train and inference time.
no code implementations • 19 Jan 2024 • Abhinav Garg, Jiyeon Kim, Sushil Khyalia, Chanwoo Kim, Dhananjaya Gowda
Grapheme-to-Phoneme (G2P) is an essential first step in any modern, high-quality Text-to-Speech (TTS) system.
no code implementations • 14 Nov 2023 • Jaeik Jeon, Jiyeon Kim, Yeonggul Jang, Yeonyee E. Yoon, Dawun Jeong, Youngtaek Hong, Seung-Ah Lee, Hyuk-Jae Chang
Doppler echocardiography offers critical insights into cardiac function and phases by quantifying blood flow velocities and evaluating myocardial motion.
no code implementations • 31 Aug 2023 • Jaeik Jeon, Seongmin Ha, Yeonggul Jang, Yeonyee E. Yoon, Jiyeon Kim, Hyunseok Jeong, Dawun Jeong, Youngtaek Hong, Seung-Ah Lee, Hyuk-Jae Chang
In echocardiographic view classification, accurately detecting out-of-distribution (OOD) data is essential but challenging, especially given the subtle differences between in-distribution and OOD data.
no code implementations • 19 Nov 2021 • Jiyeon Kim, Mehul Kumar, Dhananjaya Gowda, Abhinav Garg, Chanwoo Kim
To improve the accuracy of a low-resource Italian ASR, we leverage a well-trained English model, an unlabeled text corpus, and an unlabeled audio corpus using transfer learning, TTS augmentation, and SSL, respectively.
no code implementations • 19 Nov 2021 • Jiyeon Kim, Mehul Kumar, Dhananjaya Gowda, Abhinav Garg, Chanwoo Kim
However, we observe that training of MoChA models seems to be more sensitive to various factors such as the characteristics of training sets and the incorporation of additional augmentation techniques.
no code implementations • 14 Dec 2020 • Chanwoo Kim, Dhananjaya Gowda, Dongsoo Lee, Jiyeon Kim, Ankur Kumar, Sungsoo Kim, Abhinav Garg, Changwoo Han
Conventional speech recognition systems comprise a large number of discrete components such as an acoustic model, a language model, a pronunciation model, a text-normalizer, an inverse-text normalizer, a decoder based on a Weighted Finite State Transducer (WFST), and so on.
no code implementations • 22 Dec 2019 • Chanwoo Kim, Sungsoo Kim, Kwangyoun Kim, Mehul Kumar, Jiyeon Kim, Kyungmin Lee, Changwoo Han, Abhinav Garg, Eunhyang Kim, Minkyoo Shin, Shatrughan Singh, Larry Heck, Dhananjaya Gowda
Our end-to-end speech recognition system built using this training infrastructure showed a 2.44% WER on test-clean of the LibriSpeech test set after applying shallow fusion with a Transformer language model (LM).
no code implementations • 10 Aug 2019 • Dohyun Kim, Kyeorye Lee, Jiyeon Kim, Junseok Kwon, Joongheon Kim
The average accuracy is one of the major evaluation metrics for classification systems, while the accuracy deviation is another important performance metric used to evaluate various deep neural networks.