Keyword Spotting
96 papers with code • 10 benchmarks • 8 datasets
In speech processing, keyword spotting (KWS) is the task of detecting predefined keywords in spoken utterances.
(Image credit: Simon Grest)
Libraries
Use these libraries to find Keyword Spotting models and implementations.
Latest papers
TACos: Learning Temporally Structured Embeddings for Few-Shot Keyword Spotting with Dynamic Time Warping
To segment a signal into blocks to be analyzed, few-shot keyword spotting (KWS) systems often utilize a sliding window of fixed size.
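The fixed-size sliding window mentioned above can be sketched in a few lines. This is a minimal illustration, not code from the paper; the window and hop lengths (25 ms and 10 ms at 16 kHz) are common defaults chosen here for the example.

```python
import numpy as np

def sliding_windows(signal, win_len, hop):
    """Split a 1-D signal into fixed-size, possibly overlapping blocks.

    Segmenting the stream this way is the standard pre-processing step
    the snippet above describes; win_len and hop are illustrative
    parameters, not values from any particular system.
    """
    n = 1 + max(0, len(signal) - win_len) // hop
    return np.stack([signal[i * hop : i * hop + win_len] for i in range(n)])

# 1 s of 16 kHz audio split into 400-sample (25 ms) windows with a
# 160-sample (10 ms) hop yields 98 blocks of 400 samples each.
blocks = sliding_windows(np.zeros(16000), win_len=400, hop=160)
```

Each block can then be scored independently by the keyword classifier; few-shot variants such as TACos instead learn embeddings that are compared to a handful of enrollment examples, e.g. via dynamic time warping.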
Semi-Supervised Federated Learning for Keyword Spotting
Keyword Spotting (KWS) is a critical aspect of audio-based applications on mobile devices and virtual assistants.
Plug-and-Play Multilingual Few-shot Spoken Words Recognition
As technology advances and digital devices become more prevalent, seamless human-machine communication is becoming increasingly important.
Unsupervised Speech Representation Pooling Using Vector Quantization
However, the pooling problem remains; the length of speech representations is inherently variable.
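To make the pooling problem concrete: utterances produce different numbers of frames, so frame-level representations must be collapsed to a fixed-size vector before classification. The sketch below uses plain mean pooling as a simple baseline; the paper itself studies vector-quantization-based pooling, which is not reproduced here.

```python
import numpy as np

def mean_pool(reps):
    """Average variable-length frame representations over time so that
    every utterance maps to one fixed-size vector.

    Mean pooling is only a baseline illustration of the problem the
    snippet describes, not the method proposed in the paper.
    """
    return np.stack([r.mean(axis=0) for r in reps])

# Two utterances of different lengths (10 and 17 frames), each with
# 64-dimensional frame features, pool to a (2, 64) matrix.
utts = [np.random.randn(10, 64), np.random.randn(17, 64)]
pooled = mean_pool(utts)
```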
AraSpot: Arabic Spoken Command Spotting
Spoken keyword spotting (KWS) is the task of identifying a keyword in an audio stream, and is widely used on smart edge devices to activate voice assistants and perform hands-free tasks.
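A typical streaming KWS pipeline scores each frame with a keyword classifier, smooths the per-frame posteriors, and fires when the smoothed score crosses a threshold. The sketch below assumes the classifier scores are already computed; the smoothing length and threshold are illustrative, not taken from any specific system.

```python
import numpy as np

def detect_keyword(frame_scores, threshold=0.8, smooth=3):
    """Return the index of the first frame whose smoothed keyword
    posterior exceeds the threshold, or None if no frame does.

    frame_scores stands in for the output of a per-frame keyword
    classifier run over the audio stream; posterior smoothing reduces
    spurious single-frame triggers.
    """
    kernel = np.ones(smooth) / smooth
    smoothed = np.convolve(frame_scores, kernel, mode="same")
    hits = np.flatnonzero(smoothed > threshold)
    return int(hits[0]) if hits.size else None

# Synthetic posteriors: the keyword is "heard" around frames 2-5.
scores = np.array([0.1, 0.2, 0.9, 0.95, 0.97, 0.9, 0.2, 0.1])
first_hit = detect_keyword(scores, threshold=0.8, smooth=3)
```

In a deployed voice assistant this loop runs continuously on-device, which is why the compact binary networks and efficient-tuning methods listed below matter.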
LipLearner: Customizable Silent Speech Interactions on Mobile Devices
Silent speech interfaces are a promising technology that enables private communication in natural language.
ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification
Transformers, which were originally developed for natural language processing, have recently generated significant interest in the computer vision and audio communities due to their flexibility in learning long-range relationships.
BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance
We highlight that, benefiting from the compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an impressive 25.1x speedup and 20.2x storage saving on edge hardware.
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models
To overcome these limitations, the present study explores and applies efficient transfer learning methods in the audio domain.
MAST: Multiscale Audio Spectrogram Transformers
We present Multiscale Audio Spectrogram Transformer (MAST) for audio classification, which brings the concept of multiscale feature hierarchies to the Audio Spectrogram Transformer (AST).