Search Results for author: Prashanth Gurunath Shivakumar

Found 15 papers, 3 papers with code

Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks

no code implementations • 5 Jan 2024 • Kevin Everson, Yile Gu, Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-Yi Lee, Ariya Rastrow, Andreas Stolcke

In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text.

In-Context Learning intent-classification +6

Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue

no code implementations • 23 Dec 2023 • Guan-Ting Lin, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yile Gu, Shalini Ghosh, Andreas Stolcke, Hung-Yi Lee, Ivan Bulyko

Specifically, our framework serializes tasks in the order of current paralinguistic attribute prediction, response paralinguistic attribute prediction, and response text generation with autoregressive conditioning.

Attribute Language Modelling +4

Personalization for BERT-based Discriminative Speech Recognition Rescoring

no code implementations • 13 Jul 2023 • Jari Kolehmainen, Yile Gu, Aditya Gourav, Prashanth Gurunath Shivakumar, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko

On a test set with personalized named entities, we show that each of these approaches improves word error rate by over 10%, against a neural rescoring baseline.

Speech Recognition

Scaling Laws for Discriminative Speech Recognition Rescoring Models

no code implementations • 27 Jun 2023 • Yile Gu, Prashanth Gurunath Shivakumar, Jari Kolehmainen, Ankur Gandhe, Ariya Rastrow, Ivan Bulyko

We study whether this scaling property is also applicable to second-pass rescoring, which is an important component of speech recognition systems.

Speech Recognition

Phone Duration Modeling for Speaker Age Estimation in Children

no code implementations • 3 Sep 2021 • Prashanth Gurunath Shivakumar, Somer Bishop, Catherine Lord, Shrikanth Narayanan

In this paper, we propose features specific to children and focus on speaker's phone duration as an important biomarker of children's age.

Age Estimation regression

End-to-End Neural Systems for Automatic Children Speech Recognition: An Empirical Study

no code implementations • 19 Feb 2021 • Prashanth Gurunath Shivakumar, Shrikanth Narayanan

A key desideratum for inclusive and accessible speech recognition technology is robust performance on children's speech.

Speech Recognition

Confusion2vec 2.0: Enriching Ambiguous Spoken Language Representations with Subwords

1 code implementation • 3 Feb 2021 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou, Shrikanth Narayanan

Confusion2vec, motivated by human speech production and perception, is a word vector representation that encodes the ambiguities present in human spoken language in addition to semantic and syntactic information.

Automatic Speech Recognition (ASR) +5

RNN based Incremental Online Spoken Language Understanding

no code implementations • 23 Oct 2019 • Prashanth Gurunath Shivakumar, Naveen Kumar, Panayiotis Georgiou, Shrikanth Narayanan

We introduce and analyze different recurrent neural network architectures for incremental and online processing of ASR transcripts and compare them to existing offline systems.

Automatic Speech Recognition (ASR) +8

Behavior Gated Language Models

no code implementations • 31 Aug 2019 • Prashanth Gurunath Shivakumar, Shao-Yen Tseng, Panayiotis Georgiou, Shrikanth Narayanan

In this work we derive motivation from psycholinguistics and propose the addition of behavioral information into the context of language modeling.

Language Modelling

Spoken Language Intent Detection using Confusion2Vec

1 code implementation • 7 Apr 2019 • Prashanth Gurunath Shivakumar, Mu Yang, Panayiotis Georgiou

In this paper, we address spoken language intent detection under the noisy conditions imposed by automatic speech recognition (ASR) systems.

Automatic Speech Recognition (ASR) +3

Confusion2Vec: Towards Enriching Vector Space Word Representations with Representational Ambiguities

no code implementations • 8 Nov 2018 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou

In this paper, we propose a novel word vector representation, Confusion2Vec, motivated from the human speech production and perception that encodes representational ambiguity.

Automatic Speech Recognition (ASR) +3

Transfer Learning from Adult to Children for Speech Recognition: Evaluation, Analysis and Recommendations

no code implementations • 8 May 2018 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou

Evaluations are presented on (i) comparisons of earlier GMM-HMM and the newer DNN Models, (ii) effectiveness of standard adaptation techniques versus transfer learning, (iii) various adaptation configurations in tackling the variabilities present in children speech, in terms of (a) acoustic spectral variability, and (b) pronunciation variability and linguistic constraints.

Automatic Speech Recognition (ASR) +2

Learning from Past Mistakes: Improving Automatic Speech Recognition Output via Noisy-Clean Phrase Context Modeling

1 code implementation • 7 Feb 2018 • Prashanth Gurunath Shivakumar, Haoqi Li, Kevin Knight, Panayiotis Georgiou

In this work we model ASR as a phrase-based noisy transformation channel and propose an error correction system that can learn from the aggregate errors of all the independent modules constituting the ASR and attempt to invert those.

Automatic Speech Recognition (ASR) +2
