Keyword Spotting
96 papers with code • 10 benchmarks • 8 datasets
In speech processing, keyword spotting (KWS) is the task of detecting predefined keywords in spoken utterances.
(Image credit: Simon Grest)
Libraries
Use these libraries to find Keyword Spotting models and implementations.
Latest papers
TACos: Learning Temporally Structured Embeddings for Few-Shot Keyword Spotting with Dynamic Time Warping
To segment a signal into blocks to be analyzed, few-shot keyword spotting (KWS) systems often utilize a sliding window of fixed size.
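The fixed-size sliding window mentioned above can be sketched in a few lines. This is a minimal illustration, not code from the paper; the window and hop lengths (25 ms and 10 ms at 16 kHz) are common defaults chosen here for the example.

```python
import numpy as np

def sliding_windows(signal, win_len, hop):
    """Split a 1-D signal into fixed-size, possibly overlapping blocks.

    Segmenting the stream this way is the standard pre-processing step
    the snippet above describes; win_len and hop are illustrative
    parameters, not values from any particular system.
    """
    n = 1 + max(0, len(signal) - win_len) // hop
    return np.stack([signal[i * hop : i * hop + win_len] for i in range(n)])

# 1 s of 16 kHz audio split into 400-sample (25 ms) windows with a
# 160-sample (10 ms) hop yields 98 blocks of 400 samples each.
blocks = sliding_windows(np.zeros(16000), win_len=400, hop=160)
```

Each block can then be scored independently by the keyword classifier; few-shot variants such as TACos instead learn embeddings that are compared to a handful of enrollment examples, e.g. via dynamic time warping.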
Semi-Supervised Federated Learning for Keyword Spotting
Keyword Spotting (KWS) is a critical aspect of audio-based applications on mobile devices and virtual assistants.
Plug-and-Play Multilingual Few-shot Spoken Words Recognition
As technology advances and digital devices become more prevalent, seamless human-machine communication is becoming increasingly important.
Unsupervised Speech Representation Pooling Using Vector Quantization
However, the pooling problem remains; the length of speech representations is inherently variable.
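To make the pooling problem concrete: utterances produce different numbers of frames, so frame-level representations must be collapsed to a fixed-size vector before classification. The sketch below uses plain mean pooling as a simple baseline; the paper itself studies vector-quantization-based pooling, which is not reproduced here.

```python
import numpy as np

def mean_pool(reps):
    """Average variable-length frame representations over time so that
    every utterance maps to one fixed-size vector.

    Mean pooling is only a baseline illustration of the problem the
    snippet describes, not the method proposed in the paper.
    """
    return np.stack([r.mean(axis=0) for r in reps])

# Two utterances of different lengths (10 and 17 frames), each with
# 64-dimensional frame features, pool to a (2, 64) matrix.
utts = [np.random.randn(10, 64), np.random.randn(17, 64)]
pooled = mean_pool(utts)
```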
AraSpot: Arabic Spoken Command Spotting
Spoken keyword spotting (KWS) is the task of identifying a keyword in an audio stream, and is widely used on smart edge devices to activate voice assistants and perform hands-free tasks.
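A typical streaming KWS pipeline scores each frame with a keyword classifier, smooths the per-frame posteriors, and fires when the smoothed score crosses a threshold. The sketch below assumes the classifier scores are already computed; the smoothing length and threshold are illustrative, not taken from any specific system.

```python
import numpy as np

def detect_keyword(frame_scores, threshold=0.8, smooth=3):
    """Return the index of the first frame whose smoothed keyword
    posterior exceeds the threshold, or None if no frame does.

    frame_scores stands in for the output of a per-frame keyword
    classifier run over the audio stream; posterior smoothing reduces
    spurious single-frame triggers.
    """
    kernel = np.ones(smooth) / smooth
    smoothed = np.convolve(frame_scores, kernel, mode="same")
    hits = np.flatnonzero(smoothed > threshold)
    return int(hits[0]) if hits.size else None

# Synthetic posteriors: the keyword is "heard" around frames 2-5.
scores = np.array([0.1, 0.2, 0.9, 0.95, 0.97, 0.9, 0.2, 0.1])
first_hit = detect_keyword(scores, threshold=0.8, smooth=3)
```

In a deployed voice assistant this loop runs continuously on-device, which is why the compact binary networks and efficient-tuning methods listed below matter.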
LipLearner: Customizable Silent Speech Interactions on Mobile Devices
Silent speech interfaces are a promising technology that enables private communication in natural language.
ASiT: Local-Global Audio Spectrogram vIsion Transformer for Event Classification
Transformers, which were originally developed for natural language processing, have recently generated significant interest in the computer vision and audio communities due to their flexibility in learning long-range relationships.
BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance
We highlight that, benefiting from the compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an impressive 25.1x speedup and 20.2x storage saving on edge hardware.
Integrated Parameter-Efficient Tuning for General-Purpose Audio Models
To overcome these limitations, the present study explores and applies efficient transfer learning methods in the audio domain.
MAST: Multiscale Audio Spectrogram Transformers
We present Multiscale Audio Spectrogram Transformer (MAST) for audio classification, which brings the concept of multiscale feature hierarchies to the Audio Spectrogram Transformer (AST).