Discriminatory and orthogonal feature learning for noise robust keyword spotting

20 Oct 2022 · Donghyeon Kim, Kyungdeuk Ko, David K. Han, Hanseok Ko

Keyword Spotting (KWS) is an essential component of a smart device, alerting the system when a user prompts it with a command. Because these devices are typically constrained in computational and energy resources, the KWS model should have a small footprint. In our previous work, we developed lightweight dynamic filters that extract a robust feature map in a noisy environment. The learnable variables of the dynamic filter are jointly optimized with the KWS weights using Cross-Entropy (CE) loss. CE loss alone, however, is not sufficient for high performance when the signal-to-noise ratio (SNR) is low. To train the network for more robust performance in noisy environments, we introduce the LOw Variant Orthogonal (LOVO) loss. The LOVO loss is composed of a triplet loss applied to the output of the dynamic filter, a spectral norm-based orthogonal loss, and an inner class distance loss applied in the KWS model. These losses are particularly useful in encouraging the network to extract discriminatory features in unseen noise environments.
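
The abstract names the three components of the LOVO loss but does not give their formulas. The following is a minimal sketch of how such terms are commonly written, not the authors' implementation. Everything specific here is an assumption for illustration: the function names, the Euclidean triplet hinge, the orthogonality penalty taken as the spectral norm of W W^T - I, the inner class term taken as mean squared distance to the class centroid, and the weights `w_tri`, `w_orth`, and `w_intra`. It uses plain Python lists so that it runs without any third-party library.

```python
import math

def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, _dist(anchor, positive) - _dist(anchor, negative) + margin)

def orthogonal_loss(W, iters=200):
    """Spectral norm of W W^T - I, estimated by power iteration.

    W is a list of rows. W W^T - I is symmetric, so plain power
    iteration converges to its largest absolute eigenvalue.
    Assumed form; the paper's exact penalty may differ.
    """
    n = len(W)
    M = [[sum(a * b for a, b in zip(W[i], W[j])) - (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    # Arbitrary deterministic start vector, normalized to unit length.
    v = [float(i + 1) for i in range(n)]
    v_norm = math.sqrt(sum(x * x for x in v))
    v = [x / v_norm for x in v]
    sigma = 0.0
    for _ in range(iters):
        u = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        sigma = math.sqrt(sum(x * x for x in u))
        if sigma == 0.0:
            return 0.0  # rows of W are already orthonormal
        v = [x / sigma for x in u]
    return sigma

def intra_class_loss(embeddings, labels):
    """Mean squared distance of each embedding to its class centroid."""
    by_class = {}
    for e, y in zip(embeddings, labels):
        by_class.setdefault(y, []).append(e)
    total = 0.0
    for members in by_class.values():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        total += sum(_dist(e, centroid) ** 2 for e in members)
    return total / len(embeddings)

def lovo_loss(triplet, W, embeddings, labels,
              w_tri=1.0, w_orth=1.0, w_intra=1.0):
    """Weighted sum of the three terms (weights are hypothetical)."""
    a, p, n = triplet
    return (w_tri * triplet_loss(a, p, n)
            + w_orth * orthogonal_loss(W)
            + w_intra * intra_class_loss(embeddings, labels))
```

In this sketch the triplet would be drawn from the dynamic filter's output, while `W` and `embeddings` would come from the KWS model, matching where the abstract says each term is applied. In training, such a sum would typically be added to the CE loss.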

Tasks

Keyword Spotting

Datasets

WHAM!

Methods

Triplet Loss
