Search Results for author: Kshitiz Kumar

Found 5 papers, 0 papers with code

Bilingual Streaming ASR with Grapheme units and Auxiliary Monolingual Loss

no code implementations • 11 Aug 2023 • Mohammad Soleymanpour, Mahmoud Al Ismail, Fahimeh Bahmaninezhad, Kshitiz Kumar, Jian Wu

Our key developments are: (a) a pronunciation lexicon with grapheme units instead of phone units, (b) a fully bilingual alignment model and, subsequently, a bilingual streaming transformer model, (c) a parallel encoder structure with a language identification (LID) loss, and (d) a parallel encoder with an auxiliary loss for monolingual projections.

Automatic Speech Recognition (ASR)

Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, And Pretraining: An Ablation Study

no code implementations • 7 Feb 2022 • Daniel Tompkins, Kshitiz Kumar, Jian Wu

An Xception model reaches state-of-the-art (SOTA) accuracy on the ESC-50 dataset for audio event detection through knowledge transfer from ImageNet weights, pretraining on AudioSet, and an on-the-fly data augmentation pipeline.

Data Augmentation, Event Detection

Sequence-level Confidence Classifier for ASR Utterance Accuracy and Application to Acoustic Models

no code implementations • 30 Jun 2021 • Amber Afshan, Kshitiz Kumar, Jian Wu

We propose a cost-effective method that uses confidence classifier (CC) scores to select an optimal adaptation data set, maximizing ASR gains from minimal data.

Automatic Speech Recognition (ASR)

Transfer Learning Approaches for Streaming End-to-End Speech Recognition System

no code implementations • 12 Aug 2020 • Vikas Joshi, Rui Zhao, Rupesh R. Mehta, Kshitiz Kumar, Jinyu Li

Transfer learning (TL) is widely used in conventional hybrid automatic speech recognition (ASR) systems to transfer knowledge from a source language to a target language.

Automatic Speech Recognition (ASR)

Speaker Adaptation for End-to-End CTC Models

no code implementations • 4 Jan 2019 • Ke Li, Jinyu Li, Yong Zhao, Kshitiz Kumar, Yifan Gong

We propose two approaches for speaker adaptation in end-to-end (E2E) automatic speech recognition systems.

Automatic Speech Recognition (ASR)
