no code implementations • 9 Oct 2023 • Utkarsh Oggy Sarawgi, John Berkowitz, Vineet Garg, Arnav Kundu, Minsik Cho, Sai Srujana Buddi, Saurabh Adya, Ahmed Tewfik
Streaming neural network models for fast frame-wise responses to various speech and sensory signals are widely adopted on resource-constrained platforms.
no code implementations • 9 Sep 2023 • Pranay Dighe, Yi Su, Shangshang Zheng, Yunshu Liu, Vineet Garg, Xiaochuan Niu, Ahmed Tewfik
While large language models excel in a variety of natural language processing (NLP) tasks, to perform well on spoken language understanding (SLU) tasks, they must either rely on off-the-shelf automatic speech recognition (ASR) systems for transcription, or be equipped with an in-built speech modality.
Automatic Speech Recognition (ASR) +7
no code implementations • 21 Feb 2023 • Madhumitha Sakthi, Ahmed Tewfik, Marius Arvinte, Haris Vikalo
Automotive radar has increasingly attracted attention due to growing interest in autonomous driving technologies.
no code implementations • 21 Oct 2022 • Pranay Dighe, Prateeth Nayak, Oggi Rudovic, Erik Marchi, Xiaochuan Niu, Ahmed Tewfik
Accurate prediction of the user's intent to interact with a voice assistant (VA) on a device (e.g., on the phone) is critical for achieving naturalistic, engaging, and privacy-centric interactions with the VA. To this end, we present a novel approach that predicts the user's intent (whether the user is speaking to the device or not) directly from acoustic and textual information encoded in subword tokens obtained via an end-to-end ASR model.
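The token-level fusion described above can be sketched in a few lines. All names below are illustrative assumptions, not the paper's implementation, and a real system would use learned neural encoders and classifiers rather than this toy linear scorer:

```python
import math

def predict_intent(acoustic_embs, text_embs, weights, bias=0.0):
    """Toy device-directed-speech scorer: concatenate per-subword-token
    acoustic and textual embeddings, mean-pool over tokens, then apply a
    linear layer and sigmoid to get P(user is addressing the device)."""
    # Per-token fusion: list concatenation stands in for feature concat.
    fused = [a + t for a, t in zip(acoustic_embs, text_embs)]
    dim = len(fused[0])
    # Mean-pool the fused token embeddings into one utterance vector.
    pooled = [sum(tok[i] for tok in fused) / len(fused) for i in range(dim)]
    score = sum(w * x for w, x in zip(weights, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-score))
```

With zero weights the scorer is maximally uncertain and returns 0.5, which is a useful sanity check before training.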
1 code implementation • 10 Jul 2022 • Yan Han, Gregory Holste, Ying Ding, Ahmed Tewfik, Yifan Peng, Zhangyang Wang
Using the learned self-attention of its image branch, RGT extracts a bounding box from which radiomic features are computed; these features are further processed by the radiomics branch, and the learned image and radiomic features are then fused and interact via cross-attention layers.
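The cross-attention interaction between the image and radiomics branches reduces to scaled dot-product attention. A minimal single-head sketch follows; the names and shapes are illustrative assumptions, not the paper's implementation:

```python
import math

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention: queries from one
    branch (e.g., image features) attend over keys/values from the other
    branch (e.g., radiomic features)."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        # Numerically stable softmax over the scores.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        attn = [e / z for e in exps]
        # Attention-weighted sum of the value vectors.
        out.append([sum(a * v[i] for a, v in zip(attn, values))
                    for i in range(len(values[0]))])
    return out
```

When all keys are identical the attention weights are uniform, so the output is just the mean of the values, which makes the function easy to sanity-check.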
no code implementations • 5 Apr 2022 • Prateeth Nayak, Takuya Higuchi, Anmol Gupta, Shivesh Ranjan, Stephen Shum, Siddharth Sigtia, Erik Marchi, Varun Lakshminarasimhan, Minsik Cho, Saurabh Adya, Chandra Dhir, Ahmed Tewfik
A detector is typically trained on speech data independent of speaker information and used for the voice trigger detection task.
no code implementations • 30 Mar 2022 • Vineet Garg, Ognjen Rudovic, Pranay Dighe, Ahmed H. Abdelaziz, Erik Marchi, Saurabh Adya, Chandra Dhir, Ahmed Tewfik
We also show that the ensemble of the LatticeRNN and acoustic-distilled models brings a further accuracy improvement of 20%.
no code implementations • 8 Mar 2022 • Madhumitha Sakthi, Ahmed Tewfik, Marius Arvinte, Haris Vikalo
We show robust detection based on radar data reconstructed using 20% of the samples under extreme weather conditions, such as snow or fog, and in low-illumination night scenes.
no code implementations • 29 Sep 2021 • Yan Han, Ying Ding, Ahmed Tewfik, Yifan Peng, Zhangyang Wang
During training, the image branch leverages its learned attention to estimate pathology localization, which is then utilized to extract radiomic features from images in the radiomics branch.
no code implementations • 11 Apr 2021 • Yan Han, Chongyan Chen, Ahmed Tewfik, Benjamin Glicksberg, Ying Ding, Yifan Peng, Zhangyang Wang
The key knob of our framework is a unique positive-sampling approach tailored to medical images that seamlessly integrates radiomic features as a knowledge augmentation.
no code implementations • 25 Nov 2020 • Yan Han, Chongyan Chen, Liyan Tang, Mingquan Lin, Ajay Jaiswal, Song Wang, Ahmed Tewfik, George Shih, Ying Ding, Yifan Peng
After a number of iterations and with the help of radiomic features, our framework can converge to more accurate image regions.
1 code implementation • 5 Oct 2020 • Madhumitha Sakthi, Ahmed Tewfik
In this paper, we introduce an algorithm to utilize object detection results from the image to adaptively sample and acquire radar data using Compressed Sensing (CS).
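The adaptive acquisition idea, spending more of the compressed-sensing measurement budget on radar regions where the camera detects objects, can be illustrated with a toy budget allocator. The function name, the region/detection representation, and the 80/20 split are all illustrative assumptions, not the paper's scheme:

```python
def allocate_samples(regions, detections, total_budget, hot_ratio=0.8):
    """Split a compressed-sensing sampling budget across radar regions:
    regions overlapping an image detection share the 'hot' fraction of
    the budget; the remaining regions share what is left uniformly."""
    hot = [r for r in regions if r in detections]
    cold = [r for r in regions if r not in detections]
    plan = {}
    for r in hot:
        plan[r] = int(hot_ratio * total_budget / max(len(hot), 1))
    for r in cold:
        plan[r] = int((1 - hot_ratio) * total_budget / max(len(cold), 1))
    return plan
```

For example, with four regions, one detection, and a budget of 100 measurements, the detected region gets 80 samples while the other three share the remainder.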
no code implementations • 28 Sep 2020 • Madhumitha Sakthi, Ahmed Tewfik
We use previous radar frame information to mitigate the potential information loss of an object missed by the image or the object detection network.
no code implementations • 1 Jun 2020 • Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
In this paper we introduce a recurrent neural network (RNN) based variational autoencoder (VAE) model with a new constrained loss function that can generate more meaningful electroencephalography (EEG) features from raw EEG features to improve the performance of EEG-based speech recognition systems.
Automatic Speech Recognition (ASR) +3
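A constrained VAE objective of the kind described above typically augments the usual reconstruction and KL terms with an auxiliary penalty. The sketch below is a generic illustration under assumed names and weights (beta, gamma), not the paper's actual loss:

```python
import math

def vae_loss(x, x_hat, mu, logvar, aux_penalty=0.0, beta=1.0, gamma=1.0):
    """Toy constrained VAE objective:
    MSE reconstruction + beta * KL(q(z|x) || N(0, I)) + gamma * constraint,
    where the constraint could be, e.g., a recognition-oriented penalty."""
    # Mean squared reconstruction error.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    # Closed-form KL divergence to a standard normal prior.
    kl = -0.5 * sum(1 + lv - m ** 2 - math.exp(lv)
                    for m, lv in zip(mu, logvar))
    return recon + beta * kl + gamma * aux_penalty
```

With a perfect reconstruction and a posterior equal to the prior (mu = 0, logvar = 0), the loss collapses to gamma times the auxiliary penalty, which is a convenient check on the implementation.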
no code implementations • 29 May 2020 • Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
The electroencephalography (EEG) signals recorded in parallel with speech are used to perform isolated and continuous speech recognition.
no code implementations • 29 May 2020 • Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
In this paper we demonstrate that it is possible to generate more meaningful electroencephalography (EEG) features from raw EEG features using generative adversarial networks (GANs) to improve the performance of EEG-based continuous speech recognition systems.
no code implementations • 29 May 2020 • Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
In [1, 2], the authors provided preliminary results for synthesizing speech from electroencephalography (EEG) features: they first predict acoustic features from EEG features, and the speech is then reconstructed from the predicted acoustic features using the Griffin-Lim reconstruction algorithm.
no code implementations • 16 May 2020 • Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
In this paper we explore predicting facial or lip video features from electroencephalography (EEG) features and predicting EEG features from recorded facial or lip video frames using deep learning models.
no code implementations • 9 Apr 2020 • Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
In this paper we introduce an attention-regression model to demonstrate the prediction of acoustic features from electroencephalography (EEG) features recorded in parallel with spoken sentences.
no code implementations • 7 Mar 2020 • Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
In this paper we explore speaker identification using electroencephalography (EEG) signals.
1 code implementation • 28 Feb 2020 • Marius Arvinte, Ahmed Tewfik, Sriram Vishwanath
We introduce an adversarial sample detection algorithm based on image residuals, specifically designed to guard against patch-based attacks.
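A residual-based detector of this kind computes a statistic from whatever a denoiser cannot explain: adversarial patches tend to leave high-frequency residue that inflates the statistic. The sketch below is a simplified illustration under assumed names, not the paper's algorithm:

```python
def residual_energy(image, denoise):
    """Detection statistic from image residuals: subtract a denoised
    version of the input (flattened pixel list) and measure the
    per-pixel energy of what remains."""
    denoised = denoise(image)
    return sum((p - q) ** 2 for p, q in zip(image, denoised)) / len(image)

def is_adversarial(image, denoise, threshold):
    """Flag the input as adversarial when its residual energy exceeds
    a threshold calibrated on clean data."""
    return residual_energy(image, denoise) > threshold
```

Any smoothing operator can play the role of `denoise` here; a flat image passes the check while an image with a localized spike (a crude stand-in for a patch) is flagged.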
no code implementations • 6 Feb 2020 • Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
Our results demonstrate the feasibility of using EEG signals for performing continuous silent speech recognition.
Automatic Speech Recognition (ASR) +3