Semi-Supervised Speech Recognition via Local Prior Matching

24 Feb 2020 · Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Hannun

For sequence transduction tasks like speech recognition, a strong structured prior model encodes rich information about the target space, implicitly ruling out invalid sequences by assigning them low probability. In this work, we propose local prior matching (LPM), a semi-supervised objective that distills knowledge from a strong prior (e.g. a language model) to provide learning signal to a discriminative model trained on unlabeled speech...
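
The abstract is truncated here and does not state the LPM objective, so the following is only a minimal sketch of one plausible reading of "distilling knowledge from a prior": for a single unlabeled utterance, a set of candidate transcripts is scored by both the language model and the speech model, both score sets are renormalized over that candidate set, and the loss is the cross-entropy from the renormalized prior to the renormalized model distribution. The function names, the renormalization over candidates, and the input format are all assumptions made for illustration, not details taken from the paper.

```python
import math


def log_softmax(scores):
    """Normalize a list of log-scores so that they sum to 1 in probability space."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def local_prior_matching_loss(lm_log_probs, model_log_probs):
    """Cross-entropy from a prior to a model over one candidate set.

    lm_log_probs[i]    : hypothetical LM log-probability of candidate i
    model_log_probs[i] : hypothetical speech-model log-probability of candidate i
    Both lists are renormalized over the candidates (an assumption of this sketch).
    """
    if not lm_log_probs or len(lm_log_probs) != len(model_log_probs):
        raise ValueError("need two non-empty score lists of equal length")
    log_prior = log_softmax(lm_log_probs)
    log_model = log_softmax(model_log_probs)
    # Weight each candidate's model log-probability by its prior probability.
    return -sum(math.exp(lq) * lp for lq, lp in zip(log_prior, log_model))


# Two candidates: the loss is lower when the model ranks them like the prior.
agree = local_prior_matching_loss([-1.0, -5.0], [-1.0, -5.0])
disagree = local_prior_matching_loss([-1.0, -5.0], [-5.0, -1.0])
```

In this sketch the loss is smallest when the model's distribution over candidates equals the prior's, which is the sense in which the prior supplies a training signal without a ground-truth transcript.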

| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Speech Recognition | LibriSpeech test-clean | Local Prior Matching (Large Model) | Word Error Rate (WER) | 7.19 | # 25 |
| Speech Recognition | LibriSpeech test-other | Local Prior Matching (Large Model) | Word Error Rate (WER) | 20.84 | # 20 |
| Speech Recognition | LibriSpeech test-other | Local Prior Matching (Large Model, ConvLM LM) | Word Error Rate (WER) | 15.28 | # 18 |
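
The metric in the results above is Word Error Rate: the word-level edit distance (substitutions, insertions, and deletions) between a hypothesis and its reference transcript, divided by the number of reference words and usually reported as a percentage. A minimal sketch for a single utterance follows; the function name and whitespace tokenization are assumptions for illustration, and benchmark scoring tools typically also normalize text and aggregate edits over a whole test set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the WER of one hypothesis against one reference, in percent."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
example = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions count as errors, the value can exceed 100 when the hypothesis is much longer than the reference.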
