no code implementations • 27 Sep 2023 • Avamarie Brueggeman, Takuya Higuchi, Masood Delfarah, Stephen Shum, Vineet Garg
Our investigation reveals that speech enhancement (SE) can improve keyword spotting (KWS) accuracy on noisy speech when the backend model is trained on clean speech; however, despite extensive exploration, we found it difficult to improve KWS accuracy with SE when the backend is trained on noisy speech.
no code implementations • 27 Sep 2023 • Takuya Higuchi, Avamarie Brueggeman, Masood Delfarah, Stephen Shum
Voice triggering (VT) enables users to activate their devices by just speaking a trigger phrase.
no code implementations • 5 Apr 2022 • Prateeth Nayak, Takuya Higuchi, Anmol Gupta, Shivesh Ranjan, Stephen Shum, Siddharth Sigtia, Erik Marchi, Varun Lakshminarasimhan, Minsik Cho, Saurabh Adya, Chandra Dhir, Ahmed Tewfik
A detector is typically trained on speech data independent of speaker information and used for the voice trigger detection task.
no code implementations • 6 Aug 2020 • Filip Granqvist, Matt Seigel, Rogier Van Dalen, Áine Cahill, Stephen Shum, Matthias Paulik
From these features, the model predicts speaker characteristic labels considered useful as side information.