Search Results for author: Devang Naik

Found 15 papers, 0 papers with code

Weight subcloning: direct initialization of transformers using larger pretrained ones

no code implementations · 14 Dec 2023 · Mohammad Samragh, Mehrdad Farajtabar, Sachin Mehta, Raviteja Vemulapalli, Fartash Faghri, Devang Naik, Oncel Tuzel, Mohammad Rastegari

The usual practice of transfer learning overcomes this challenge by initializing the model with weights of a pretrained model of the same size and specification to increase the convergence and training speed.

Image Classification, Transfer Learning

eDKM: An Efficient and Accurate Train-time Weight Clustering for Large Language Models

no code implementations · 2 Sep 2023 · Minsik Cho, Keivan A. Vahid, Qichen Fu, Saurabh Adya, Carlo C Del Mundo, Mohammad Rastegari, Devang Naik, Peter Zatloukal

Since Large Language Models (LLMs) have demonstrated high-quality performance on many complex language tasks, there is great interest in bringing these LLMs to mobile devices for faster responses and better privacy protection.

Clustering, Quantization

Improving vision-inspired keyword spotting using dynamic module skipping in streaming conformer encoder

no code implementations · 31 Aug 2023 · Alexandre Bittar, Paul Dixon, Mohammad Samragh, Kumari Nishu, Devang Naik

Using a vision-inspired keyword spotting framework, we propose an architecture with input-dependent dynamic depth capable of processing streaming audio.

Keyword Spotting

Flexible Keyword Spotting based on Homogeneous Audio-Text Embedding

no code implementations · 12 Aug 2023 · Kumari Nishu, Minsik Cho, Paul Dixon, Devang Naik

Spotting user-defined/flexible keywords represented in text frequently uses an expensive text encoder for joint analysis with an audio encoder in an embedding space, which can suffer from heterogeneous modality representation (i.e., large mismatch) and increased complexity.

Keyword Spotting

Matching Latent Encoding for Audio-Text based Keyword Spotting

no code implementations · 8 Jun 2023 · Kumari Nishu, Minsik Cho, Devang Naik

Using audio and text embeddings jointly for Keyword Spotting (KWS) has shown high-quality results, but the key challenge of how to semantically align two embeddings for multi-word keywords of different sequence lengths remains largely unsolved.

Keyword Spotting

Optimize what matters: Training DNN-HMM Keyword Spotting Model Using End Metric

no code implementations · 2 Nov 2020 · Ashish Shrivastava, Arnav Kundu, Chandra Dhir, Devang Naik, Oncel Tuzel

The DNN, in prior methods, is trained independently of the HMM parameters to minimize the cross-entropy loss between the predicted and the ground-truth state probabilities.

Keyword Spotting

Complementary Language Model and Parallel Bi-LRNN for False Trigger Mitigation

no code implementations · 18 Aug 2020 · Rishika Agarwal, Xiaochuan Niu, Pranay Dighe, Srikanth Vishnubhotla, Sameer Badaskar, Devang Naik

In this paper, we propose a novel solution to the FTM problem by introducing a parallel ASR decoding process with a special language model trained from "out-of-domain" data sources.

Language Modelling

Multi-task Learning for Speaker Verification and Voice Trigger Detection

no code implementations · 26 Jan 2020 · Siddharth Sigtia, Erik Marchi, Sachin Kajarekar, Devang Naik, John Bridle

We train the network in a supervised multi-task learning setup, where the speech transcription branch of the network is trained to minimise a phonetic connectionist temporal classification (CTC) loss while the speaker recognition branch of the network is trained to label the input sequence with the correct label for the speaker.

Multi-Task Learning, Speaker Recognition, +1

Lattice-based Improvements for Voice Triggering Using Graph Neural Networks

no code implementations · 25 Jan 2020 · Pranay Dighe, Saurabh Adya, Nuoyu Li, Srikanth Vishnubhotla, Devang Naik, Adithya Sagar, Ying Ma, Stephen Pulman, Jason Williams

A pure trigger-phrase detector model doesn't fully utilize the intent of the user speech, whereas by using the complete decoding lattice of user audio, we can effectively mitigate speech not intended for the smart assistant.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1
