no code implementations • 30 Oct 2018 • Yi-Chen Chen, Chia-Hao Shen, Sung-Feng Huang, Hung-Yi Lee, Lin-shan Lee
This can be learned by aligning a small number of spoken words and the corresponding text words in the embedding spaces.
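A minimal sketch of the alignment idea mentioned in the snippet: given a handful of paired (spoken-word embedding, text-word embedding) examples, learn a mapping from the audio embedding space into the text embedding space. The orthogonal Procrustes solution, the dimensions, and the toy data below are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed setup: 50 paired examples, equal embedding dimensions on both sides.
n_pairs, dim = 50, 128
X = rng.standard_normal((n_pairs, dim))   # spoken-word embeddings (audio space)
Y = rng.standard_normal((n_pairs, dim))   # corresponding text-word embeddings

# Orthogonal Procrustes: W = argmin over orthogonal W of ||X W - Y||_F
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# Map a new spoken-word embedding into the text space and retrieve the
# nearest paired text word by cosine similarity.
query = rng.standard_normal(dim)
mapped = query @ W
sims = (Y @ mapped) / (np.linalg.norm(Y, axis=1) * np.linalg.norm(mapped) + 1e-8)
print("nearest paired text word index:", int(np.argmax(sims)))
```

Because only a small number of pairs is needed to estimate the mapping, the rest of the two embedding spaces can be related without word-level supervision.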
no code implementations • 21 Jul 2018 • Yi-Chen Chen, Sung-Feng Huang, Chia-Hao Shen, Hung-Yi Lee, Lin-shan Lee
Stage 1 performs phonetic embedding with speaker characteristics disentangled.
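A minimal sketch of one common way to obtain phonetic embeddings with speaker characteristics disentangled: two encoders produce separate content and speaker representations, and a decoder reconstructs the frames from their combination. The layer sizes, the GRU architecture, and the reconstruction-only objective are assumptions for illustration; the actual Stage 1 training criteria are not shown in the snippet.

```python
import torch
import torch.nn as nn

class DisentangledEncoder(nn.Module):
    def __init__(self, feat_dim=39, phn_dim=64, spk_dim=32):
        super().__init__()
        self.phonetic_enc = nn.GRU(feat_dim, phn_dim, batch_first=True)  # content stream
        self.speaker_enc = nn.GRU(feat_dim, spk_dim, batch_first=True)   # speaker stream
        self.decoder = nn.GRU(phn_dim + spk_dim, feat_dim, batch_first=True)

    def forward(self, frames):                     # frames: (B, T, feat_dim)
        phn_seq, _ = self.phonetic_enc(frames)     # per-frame phonetic embeddings
        _, spk_h = self.speaker_enc(frames)        # utterance-level speaker vector
        spk = spk_h[-1].unsqueeze(1).expand(-1, frames.size(1), -1)
        recon, _ = self.decoder(torch.cat([phn_seq, spk], dim=-1))
        return phn_seq, spk_h[-1], recon

model = DisentangledEncoder()
x = torch.randn(4, 100, 39)                        # a batch of MFCC-like frame sequences
phonetic, speaker, recon = model(x)
loss = nn.functional.mse_loss(recon, x)            # reconstruction term only, as a stand-in
```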
no code implementations • 29 Mar 2018 • Yi-Chen Chen, Chia-Hao Shen, Sung-Feng Huang, Hung-Yi Lee
In this work, we propose a framework to achieve unsupervised ASR on a read English speech dataset, where audio and text are unaligned.
Automatic Speech Recognition (ASR) +1
no code implementations • 19 Jul 2017 • Chia-Hao Shen, Janet Y. Sung, Hung-Yi Lee
We train the sequence autoencoder (SA) on one language (the source language) and use it to extract vector representations of the audio segments of another language (the target language).
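A minimal sketch of this transfer setup: a sequence autoencoder is trained to reconstruct source-language audio segments, and its encoder is then reused, unchanged, to embed target-language segments as fixed-dimensional vectors. The architecture sizes, toy data, and training loop below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SeqAutoencoder(nn.Module):
    def __init__(self, feat_dim=39, hid_dim=128):
        super().__init__()
        self.encoder = nn.GRU(feat_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(hid_dim, feat_dim, batch_first=True)

    def embed(self, segments):                     # (B, T, feat_dim) -> (B, hid_dim)
        _, h = self.encoder(segments)
        return h[-1]                               # fixed-dimensional segment vector

    def forward(self, segments):
        z = self.embed(segments)
        z_seq = z.unsqueeze(1).expand(-1, segments.size(1), -1)
        recon, _ = self.decoder(z_seq)
        return recon

sa = SeqAutoencoder()
opt = torch.optim.Adam(sa.parameters(), lr=1e-3)

source_batch = torch.randn(8, 80, 39)              # source-language segments (toy data)
for _ in range(10):                                # train the SA on the source language only
    opt.zero_grad()
    loss = nn.functional.mse_loss(sa(source_batch), source_batch)
    loss.backward()
    opt.step()

target_batch = torch.randn(8, 80, 39)              # target-language segments, never trained on
with torch.no_grad():
    target_vectors = sa.embed(target_batch)        # representations without target-language data
print(target_vectors.shape)                        # torch.Size([8, 128])
```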
1 code implementation • 3 Mar 2016 • Yu-An Chung, Chao-Chung Wu, Chia-Hao Shen, Hung-Yi Lee, Lin-shan Lee
The vector representations of fixed dimensionality for words (in text) offered by Word2Vec have been shown to be very useful in many application scenarios, in particular due to the semantic information they carry.