no code implementations • 11 Aug 2023 • Mohammad Soleymanpour, Mahmoud Al Ismail, Fahimeh Bahmaninezhad, Kshitiz Kumar, Jian Wu
Our key developments are: (a) a pronunciation lexicon with grapheme units instead of phone units, (b) a fully bilingual alignment model and, subsequently, a bilingual streaming transformer model, (c) a parallel encoder structure with a language identification (LID) loss, and (d) a parallel encoder with an auxiliary loss for monolingual projections.
Automatic Speech Recognition (ASR) +2
no code implementations • 7 Feb 2022 • Daniel Tompkins, Kshitiz Kumar, Jian Wu
An Xception model reaches state-of-the-art (SOTA) accuracy on the ESC-50 dataset for audio event detection through knowledge transfer from ImageNet weights, pretraining on AudioSet, and an on-the-fly data augmentation pipeline.
no code implementations • 30 Jun 2021 • Amber Afshan, Kshitiz Kumar, Jian Wu
We propose a cost-effective method that uses confidence classifier (CC) scores to select an optimal adaptation data set, maximizing ASR gains from minimal data.
Automatic Speech Recognition (ASR) +1
no code implementations • 12 Aug 2020 • Vikas Joshi, Rui Zhao, Rupesh R. Mehta, Kshitiz Kumar, Jinyu Li
Transfer learning (TL) is widely used in conventional hybrid automatic speech recognition (ASR) systems to transfer knowledge from a source language to a target language.
Automatic Speech Recognition (ASR) +2
no code implementations • 4 Jan 2019 • Ke Li, Jinyu Li, Yong Zhao, Kshitiz Kumar, Yifan Gong
We propose two approaches for speaker adaptation in end-to-end (E2E) automatic speech recognition systems.
Automatic Speech Recognition (ASR) +2