Keyword Spotting

96 papers with code • 10 benchmarks • 8 datasets

In speech processing, keyword spotting deals with the identification of keywords in utterances.

(Image credit: Simon Grest)

TACos: Learning Temporally Structured Embeddings for Few-Shot Keyword Spotting with Dynamic Time Warping

wilkinghoff/kws-dailytalk 18 May 2023

To segment a signal into blocks to be analyzed, few-shot keyword spotting (KWS) systems often utilize a sliding window of fixed size.

3
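The fixed-size sliding window mentioned in the excerpt above can be sketched as follows. This is a minimal illustration, not the TACos implementation: the function name `sliding_windows` and the parameters `window_size` and `hop_size` are assumptions made for this example, and trailing samples that do not fill a whole window are simply dropped.

```python
import numpy as np


def sliding_windows(signal, window_size, hop_size):
    """Split a 1-D signal into fixed-size, possibly overlapping blocks.

    Hypothetical helper for illustration only; samples at the end that
    do not fill a complete window are discarded.
    """
    signal = np.asarray(signal)
    if window_size <= 0 or hop_size <= 0:
        raise ValueError("window_size and hop_size must be positive")
    if len(signal) < window_size:
        # Signal is too short to contain even one full window.
        return np.empty((0, window_size), dtype=signal.dtype)
    n_windows = 1 + (len(signal) - window_size) // hop_size
    starts = np.arange(n_windows) * hop_size
    return np.stack([signal[s:s + window_size] for s in starts])


# Ten samples, windows of four samples, advancing two samples at a time.
blocks = sliding_windows(np.arange(10), window_size=4, hop_size=2)
```

Each row of `blocks` is one block that a downstream keyword detector would score. A smaller `hop_size` yields more overlap and finer temporal resolution at the cost of more blocks to evaluate.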

Semi-Supervised Federated Learning for Keyword Spotting

diaoenmao/Semi-Supervised-Federated-Learing-for-Keyword-Spotting 9 May 2023

Keyword Spotting (KWS) is a critical aspect of audio-based applications on mobile devices and virtual assistants.

1

Plug-and-Play Multilingual Few-shot Spoken Words Recognition

fewshotml/plix 3 May 2023

As technology advances and digital devices become prevalent, seamless human-machine communication is increasingly gaining significance.

14

Unsupervised Speech Representation Pooling Using Vector Quantization

IIP-Sogang/speech-pooling-benchmark 8 Apr 2023

However, the pooling problem remains; the length of speech representations is inherently variable.

4
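To illustrate the pooling problem named in the excerpt above, the sketch below reduces frame-level representations of different lengths to fixed-size vectors using plain average pooling. This is a common baseline chosen for illustration, not the vector-quantization method the paper proposes; the function name `mean_pool` and the array shapes are assumptions.

```python
import numpy as np


def mean_pool(frames):
    """Average a (num_frames, feature_dim) array over time.

    Hypothetical baseline for illustration: the output has shape
    (feature_dim,) regardless of how many frames the utterance has.
    """
    frames = np.asarray(frames, dtype=float)
    if frames.ndim != 2 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty (num_frames, feature_dim) array")
    return frames.mean(axis=0)


# Two utterances of different lengths (3 and 7 frames), 4 features each.
short_utterance = np.ones((3, 4))
long_utterance = np.ones((7, 4)) * 2.0

# After pooling, both utterances have the same shape and can be stacked.
pooled = np.stack([mean_pool(short_utterance), mean_pool(long_utterance)])
```

Averaging weights every frame equally, including silence and noise, which is one reason alternative pooling strategies are studied.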

AraSpot: Arabic Spoken Command Spotting

msalhab96/araspot 29 Mar 2023

Spoken keyword spotting (KWS) is the task of identifying a keyword in an audio stream and is widely used in smart devices at the edge in order to activate voice assistants and perform hands-free tasks.

2

LipLearner: Customizable Silent Speech Interactions on Mobile Devices

rkmtlab/LipLearner 12 Feb 2023

Silent speech interface is a promising technology that enables private communications in natural language.

50

ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification

sara-ahmed/asit 23 Nov 2022

Transformers, which were originally developed for natural language processing, have recently generated significant interest in the computer vision and audio communities due to their flexibility in learning long-range relationships.

17

BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance

htqin/bifsmnv2 13 Nov 2022

We highlight that, benefiting from the compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an impressive 25.1x speedup and 20.2x storage-saving on edge hardware.

24

Integrated Parameter-Efficient Tuning for General-Purpose Audio Models

wngh1187/ipet 4 Nov 2022

To overcome these limitations, the present study explores and applies efficient transfer learning methods in the audio domain.

9

MAST: Multiscale Audio Spectrogram Transformers

Sreyan88/LAPE 2 Nov 2022

We present Multiscale Audio Spectrogram Transformer (MAST) for audio classification, which brings the concept of multiscale feature hierarchies to the Audio Spectrogram Transformer (AST).

22