In speech processing, keyword spotting deals with the identification of keywords in utterances.
(Image credit: Simon Grest)
We explore the application of end-to-end stateless temporal modeling to small-footprint keyword spotting as opposed to recurrent networks that model long-term temporal dependencies using internal states.
We train various neural network architectures for keyword spotting published in the literature to compare their accuracy and memory/compute requirements.
Ranked #6 on Keyword Spotting on Google Speech Commands (Google Speech Commands V1 12 metric)
We explore the application of deep residual learning and dilated convolutions to the keyword spotting task, using the recently released Google Speech Commands Dataset as our benchmark.
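As a rough illustration of why dilated convolutions are attractive for this task, the sketch below computes the receptive field of a stack of stride-1 1-D convolutions. The kernel size and the dilation schedule are illustrative assumptions, not values taken from the paper, and `receptive_field` is a hypothetical helper name.

```python
from typing import Sequence


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field (in input frames) of stacked stride-1 convolutions.

    Each layer with dilation d widens the receptive field by (kernel_size - 1) * d.
    """
    if kernel_size < 1 or any(d < 1 for d in dilations):
        raise ValueError("kernel_size and dilations must be positive")
    field = 1  # a single output position sees one input frame before any layer
    for dilation in dilations:
        field += (kernel_size - 1) * dilation
    return field


# Assumed example schedules: four plain layers vs. four layers with doubling dilation.
plain = receptive_field(kernel_size=3, dilations=[1, 1, 1, 1])
dilated = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8])
print(plain, dilated)
```

With the same number of layers and weights, the dilated stack in this toy setting covers more input frames than the plain one, which is the usual motivation for dilation in small-footprint models.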
We describe Honk, an open-source PyTorch reimplementation of convolutional neural networks for keyword spotting that are included as examples in TensorFlow.
In this paper, we explore a Tsetlin Machine (TM) based keyword spotting (KWS) pipeline to demonstrate low complexity and a faster rate of convergence compared to neural networks (NNs).
Well-established text line segmentation evaluation schemes, such as the Detection Rate or Recognition Accuracy, require binarized data that is annotated at the pixel level.
We propose a single neural network architecture for two tasks: on-line keyword spotting and voice activity detection.
In addition, we release the implementation of the proposed and the baseline models including an end-to-end pipeline for training models and evaluating them on mobile devices.
Ranked #7 on Keyword Spotting on Google Speech Commands
The problem of keyword spotting, i.e., identifying keywords in a real-time audio stream, is mainly solved by applying a neural network over successive sliding windows.
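The sliding-window approach described above can be sketched as follows. This is a minimal illustration, not the paper's method: `sliding_windows`, `spot_keyword`, and `mean_energy` are hypothetical names, and the energy-based scorer is only a stand-in for the neural network that would normally produce a keyword score for each window.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def sliding_windows(
    samples: Sequence[float], window: int, hop: int
) -> Iterator[Tuple[int, Sequence[float]]]:
    """Yield (start_index, chunk) for every full window in the stream."""
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    for start in range(0, len(samples) - window + 1, hop):
        yield start, samples[start:start + window]


def spot_keyword(
    samples: Sequence[float],
    window: int,
    hop: int,
    score_fn: Callable[[Sequence[float]], float],
    threshold: float,
) -> List[int]:
    """Return the start indices of windows whose score reaches the threshold."""
    return [
        start
        for start, chunk in sliding_windows(samples, window, hop)
        if score_fn(chunk) >= threshold
    ]


def mean_energy(chunk: Sequence[float]) -> float:
    # Placeholder scorer (assumption): mean squared amplitude.
    # A real system would return a classifier's keyword probability here.
    return sum(x * x for x in chunk) / len(chunk)


# Toy stream: silence, a short burst standing in for the keyword, then silence.
stream = [0.0] * 8 + [1.0] * 4 + [0.0] * 8
hits = spot_keyword(stream, window=4, hop=2, score_fn=mean_energy, threshold=0.9)
print(hits)
```

A smaller hop gives finer time resolution at the cost of more scorer calls per second of audio, which is the main compute trade-off in this style of detection.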
Describes an audio dataset of spoken words designed to help train and evaluate keyword spotting systems.
Ranked #1 on Keyword Spotting on TensorFlow