Keyword Spotting

96 papers with code • 10 benchmarks • 8 datasets

In speech processing, keyword spotting is the task of identifying keywords in spoken utterances.

(Image credit: Simon Grest)

Benchmarks

These leaderboards track progress in keyword spotting.

Dataset	Best Model
QUESST	ELiRF Fusion+Length (All Queries)
Google Speech Commands	TripletLoss-res15
hey Siri	HEiMDaL
TensorFlow	TensorFlow's model version 2
TAU Urban Acoustic Scenes 2019	CP-ResNet(ch64) w/ SSN(S=2, A=Sub)
VoxForge	1D-ConvNet
FKD	Res26
Google Speech Commands V2 12	MicroNet-KWS-L
Google Speech Commands V2 35	QuaternionNeuralNetwork
Google Speech Commands (v2)	Quaternion Neural Networks

Libraries

Use these libraries to find keyword spotting models and implementations.

PaddlePaddle/PaddleSpeech • 2 papers • 10,177 stars
holgerbovbjerg/data2vec-kws • 2 papers

Latest papers with no code

Keyword spotting -- Detecting commands in speech using deep learning

no code yet • 9 Dec 2023

Speech recognition has become an important task in the development of machine learning and artificial intelligence.

Personalizing Keyword Spotting with Speaker Information

no code yet • 6 Nov 2023

Keyword spotting systems often struggle to generalize to a diverse population with various accents and age groups.

ED-CEC: Improving Rare Word Recognition Using ASR Postprocessing Based on Error Detection and Context-Aware Error Correction

no code yet • 8 Oct 2023

Automatic speech recognition (ASR) systems often encounter difficulties in accurately recognizing rare words, leading to errors that can have a negative impact on downstream tasks such as keyword spotting, intent detection, and text summarization.

Does Single-channel Speech Enhancement Improve Keyword Spotting Accuracy? A Case Study

no code yet • 27 Sep 2023

Our investigation reveals that speech enhancement (SE) can improve keyword spotting (KWS) accuracy on noisy speech when the backend model is trained on clean speech; however, despite our extensive exploration, it is difficult to improve the KWS accuracy with SE when the backend is trained on noisy speech.

On the Non-Associativity of Analog Computations

no code yet • 25 Sep 2023

With this model we assess the importance of ordering by comparing the test accuracy of a neural network for keyword spotting that is trained on an ordered model, on a non-ordered variant, or on real hardware.

VIC-KD: Variance-Invariance-Covariance Knowledge Distillation to Make Keyword Spotting More Robust Against Adversarial Attacks

no code yet • 22 Sep 2023

Keyword spotting (KWS) refers to the task of identifying a set of predefined words in audio streams.

A Multitask Training Approach to Enhance Whisper with Contextual Biasing and Open-Vocabulary Keyword Spotting

no code yet • 18 Sep 2023

End-to-end automatic speech recognition (ASR) systems often struggle to recognize rare named entities, such as personal names, organizations, and terminology not frequently encountered in the training data.

Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks

no code yet • 18 Sep 2023

Brain-inspired spiking neural networks (SNNs) have demonstrated great potential for temporal signal processing.

Open-vocabulary Keyword-spotting with Adaptive Instance Normalization

no code yet • 13 Sep 2023

Open vocabulary keyword spotting is a crucial and challenging task in automatic speech recognition (ASR) that focuses on detecting user-defined keywords within a spoken utterance.

iPhonMatchNet: Zero-Shot User-Defined Keyword Spotting Using Implicit Acoustic Echo Cancellation

no code yet • 12 Sep 2023

In response to the increasing interest in human-machine communication across various domains, this paper introduces a novel approach called iPhonMatchNet, which addresses the challenge of barge-in scenarios, wherein user speech overlaps with device playback audio, thereby creating a self-referencing problem.
