Paper tables with annotated results for Learning To Detect Keyword Parts And Whole By Smoothed Max Pooling

Paper

Learning To Detect Keyword Parts And Whole By Smoothed Max Pooling

We propose a smoothed max pooling loss and apply it to keyword spotting systems. The proposed approach jointly trains an encoder (to detect keyword parts) and a decoder (to detect the whole keyword) in a semi-supervised manner. The new loss function allows a model to be trained to detect both the parts and the whole of a keyword without depending strictly on frame-level labels from LVCSR (large-vocabulary continuous speech recognition), which makes further optimization possible. The proposed system outperforms the baseline keyword spotting model in [1] owing to its increased optimizability. It can also be adapted more easily to on-device learning applications because of its reduced dependency on LVCSR.
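
As a rough illustration of what a "smoothed max pooling" loss could look like, the sketch below smooths per-frame keyword logits over time, picks the highest smoothed frame inside a window, and takes the negative log-probability at that frame. This is not the paper's exact formulation: the moving-average filter, the window bounds `start` and `end`, the sigmoid cross-entropy, and the function names `smooth` and `smoothed_max_pool_loss` are all assumptions made for illustration.

```python
import math


def smooth(scores, width=3):
    """Moving-average smoothing over time (assumed filter).

    The averaging window is truncated at the sequence edges.
    """
    half = width // 2
    out = []
    for t in range(len(scores)):
        window = scores[max(0, t - half): t + half + 1]
        out.append(sum(window) / len(window))
    return out


def smoothed_max_pool_loss(logits, start, end, width=3):
    """Return (best_frame, loss) for one positive example.

    logits: per-frame keyword logits (hypothetical model output).
    start, end: frame window [start, end) in which the keyword is
        assumed to occur; no exact frame-level label is required.
    """
    smoothed = smooth(logits, width)
    # Max pooling over time: select the frame with the highest smoothed logit.
    best = max(range(start, end), key=lambda t: smoothed[t])
    # Sigmoid cross-entropy against a positive target at that frame.
    prob = 1.0 / (1.0 + math.exp(-smoothed[best]))
    return best, -math.log(prob)


logits = [-2.0, -1.0, 3.0, 4.0, 3.0, -1.0]
best_frame, loss = smoothed_max_pool_loss(logits, start=0, end=len(logits))
print(best_frame, loss)
```

Because the target frame is chosen by the pooling step rather than fixed in advance, a sketch like this only needs a coarse window around the keyword, which is one way to read the abstract's claim of reduced dependence on frame-level labels.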
