no code implementations • ACL 2022 • Liming Wang, Siyuan Feng, Mark Hasegawa-Johnson, Chang Yoo
Phonemes are defined by their relationship to words: changing a phoneme changes the word.
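The minimal-contrast definition of a phoneme can be illustrated with a toy minimal-pair finder (a hypothetical sketch; the lexicon and phone transcriptions below are illustrative, not from the paper):

```python
# Toy illustration of the minimal-pair definition of phonemes:
# two words form a minimal pair if their phone strings differ
# in exactly one position, so that one phone swap changes the word.

def is_minimal_pair(a, b):
    """Return True if phone sequences a and b differ in exactly one phone."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1

lexicon = {
    "pat": ["p", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "pit": ["p", "ih", "t"],
}

pairs = [(w1, w2) for w1 in lexicon for w2 in lexicon
         if w1 < w2 and is_minimal_pair(lexicon[w1], lexicon[w2])]
print(pairs)  # e.g. "bat"/"pat" contrast /b/ vs /p/
```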
no code implementations • 9 Dec 2023 • Taijie Chen, Zijian Shen, Siyuan Feng, Linchuan Yang, Jintao Ke
To simultaneously maximize multiple system performance metrics when determining the matching radius, we devise a novel multi-task learning algorithm that enhances the convergence speed of each task (each corresponding to the optimization of one metric) and delivers more accurate overall predictions.
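One common way to balance convergence across tasks in multi-task learning is dynamic loss weighting. The sketch below follows the spirit of "dynamic weight averaging" (slower-improving tasks get larger weights); it is an illustrative scheme, not necessarily the paper's algorithm:

```python
# Hedged sketch of dynamic task weighting for multi-task learning:
# tasks whose losses decrease slowly get larger weights, so no single
# metric's task dominates or stalls. Illustrative only.
import math

def task_weights(prev_losses, curr_losses, temperature=2.0):
    """Softmax over loss ratios; slower-improving tasks get more weight."""
    ratios = [c / p for c, p in zip(curr_losses, prev_losses)]
    exps = [math.exp(r / temperature) for r in ratios]
    z = sum(exps)
    n = len(ratios)
    return [n * e / z for e in exps]  # weights sum to n

# Task 1 improved fast (loss halved), task 2 barely improved:
w = task_weights(prev_losses=[1.0, 1.0], curr_losses=[0.5, 0.9])
print(w)  # task 2 receives the larger weight
```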
no code implementations • 1 Nov 2023 • Ruihang Lai, Junru Shao, Siyuan Feng, Steven S. Lyubomirsky, Bohan Hou, Wuwei Lin, Zihao Ye, Hongyi Jin, Yuchen Jin, Jiawei Liu, Lesheng Jin, Yaxing Cai, Ziheng Jiang, Yong Wu, Sunghyun Park, Prakalp Srivastava, Jared G. Roesch, Todd C. Mowry, Tianqi Chen
Dynamic shape computations have become critical in modern machine learning workloads, especially in emerging large language models.
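The core difficulty with dynamic shapes is that some dimensions (e.g. sequence length during LLM decoding) are unknown until runtime, so the compiler must reason about them symbolically. A minimal sketch of symbolic shape propagation, loosely inspired by this idea (illustrative, not the actual Relax implementation):

```python
# Hedged sketch of symbolic shape tracking for dynamic-shape workloads:
# shapes mix concrete ints and named symbolic dims (strings), and an
# operator like matmul checks compatibility and propagates the symbols.

def matmul_shape(a_shape, b_shape):
    """Infer the output shape of A @ B with possibly symbolic dims."""
    (m, k1), (k2, n) = a_shape, b_shape
    # Two dims unify if equal, or if either is symbolic (a string).
    if k1 != k2 and not (isinstance(k1, str) or isinstance(k2, str)):
        raise ValueError(f"shape mismatch: {k1} vs {k2}")
    return (m, n)

# Sequence length "seq" is unknown until runtime, as in LLM decoding.
out = matmul_shape(("seq", 512), (512, 4096))
print(out)  # ("seq", 4096)
```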
no code implementations • 5 Jun 2023 • Qianqian Dong, Zhiying Huang, Qiao Tian, Chen Xu, Tom Ko, Yunlong Zhao, Siyuan Feng, Tang Li, Kexin Wang, Xuxin Cheng, Fengpeng Yue, Ye Bai, Xi Chen, Lu Lu, Zejun Ma, Yuping Wang, Mingxuan Wang, Yuxuan Wang
For the speech synthesis part, we adopt the existing VALL-E X approach and build a unit-based audio language model.
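In a unit-based audio language model, speech is first discretized into unit IDs (e.g. by clustering acoustic features), and an autoregressive LM is then trained over those IDs. In the toy sketch below a bigram model stands in for the neural LM; the unit corpus is invented for illustration:

```python
# Hedged sketch of a "unit-based" audio language model: a toy bigram
# model over discrete speech-unit IDs, standing in for the
# autoregressive neural LM used in VALL-E X-style systems.
from collections import Counter, defaultdict

def train_bigram(unit_sequences):
    counts = defaultdict(Counter)
    for seq in unit_sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def most_likely_next(counts, unit):
    return counts[unit].most_common(1)[0][0]

# Utterances as sequences of discovered unit IDs (illustrative).
corpus = [[7, 3, 3, 9], [7, 3, 9], [2, 7, 3]]
model = train_bigram(corpus)
print(most_likely_next(model, 7))  # unit 7 is always followed by unit 3
```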
no code implementations • 19 May 2023 • Siyuan Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang
Moreover, on 3 of the 4 languages, compared to the standard HuBERT, the approach performs better while saving up to 1.5k hours (75%) of supervised training data.
no code implementations • 19 May 2023 • Siyuan Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang
Our main approach and adaptation are effective on extremely low-resource languages, even within domain- and language-mismatched scenarios.
1 code implementation • 22 Mar 2023 • Siyuan Feng, Taijie Chen, Yuhao Zhang, Jintao Ke, Zhengfei Zheng, Hai Yang
In addition, existing simulators still face many challenges, ranging from how closely they match the real environments of ride-sourcing systems to the completeness of the tasks they can implement.
no code implementations • 23 Feb 2023 • Kehua Chen, Jindong Han, Siyuan Feng, Hai Yang
In this paper, we propose Semantic-Fused Hierarchical Graph Transfer Learning (SF-HGTL) model to achieve knowledge transfer across cities with fused semantics.
no code implementations • 12 Jul 2022 • Duy-Nguyen Ta, Eric Cousineau, Huihua Zhao, Siyuan Feng
We present our findings in the gap between theory and practice of using conditional energy-based models (EBM) as an implicit representation for behavior-cloned policies.
2 code implementations • 9 Jul 2022 • Siyuan Feng, Bohan Hou, Hongyi Jin, Wuwei Lin, Junru Shao, Ruihang Lai, Zihao Ye, Lianmin Zheng, Cody Hao Yu, Yong Yu, Tianqi Chen
Finally, we build an end-to-end framework on top of our abstraction to automatically optimize deep learning models for given tensor computation primitives.
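The kind of transformation such a framework automates can be illustrated with loop tiling on a matrix multiply: both variants compute the same result, and only the loop structure (and thus memory locality) differs. This is a pure-Python illustration, not the paper's TensorIR abstraction:

```python
# Hedged sketch of a classic tensor-program optimization: tiling the
# loops of a matmul. A real optimizer searches over such rewrites
# automatically; here we only show that the rewrite preserves results.

def matmul_naive(A, B, n):
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_tiled(A, B, n, tile=2):
    C = [[0] * n for _ in range(n)]
    for i0 in range(0, n, tile):            # outer loops over tiles
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                for i in range(i0, i0 + tile):    # inner loops in a tile
                    for j in range(j0, j0 + tile):
                        for k in range(k0, k0 + tile):
                            C[i][j] += A[i][k] * B[k][j]
    return C

n = 4
A = [[i + j for j in range(n)] for i in range(n)]
B = [[i * j for j in range(n)] for i in range(n)]
print(matmul_naive(A, B, n) == matmul_tiled(A, B, n))  # True
```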
no code implementations • 31 May 2022 • Gaode Chen, Yijun Su, Xinghua Zhang, Anmin Hu, Guochun Chen, Siyuan Feng, Ji Xiang, Junbo Zhang, Yu Zheng
To address the above challenging problems, we propose a novel Cross-city Federated Transfer Learning framework (CcFTL) to cope with the data insufficiency and privacy problems.
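A basic building block of such federated transfer learning is federated averaging: each city trains locally, and only model parameters (never raw trip data) are shared and aggregated. The weighted-average step below is a generic sketch of that idea, not CcFTL itself:

```python
# Hedged sketch of the federated-averaging step behind cross-city
# federated learning: per-city parameter vectors are averaged,
# weighted by local data size, so raw data never leaves a city.

def fed_avg(city_params, city_sizes):
    """Average per-city parameter vectors, weighted by data size."""
    total = sum(city_sizes)
    dim = len(city_params[0])
    avg = [0.0] * dim
    for params, size in zip(city_params, city_sizes):
        for d in range(dim):
            avg[d] += params[d] * size / total
    return avg

# City B has 3x the data of city A, so it pulls the average its way.
global_params = fed_avg([[1.0, 0.0], [3.0, 2.0]], city_sizes=[1, 3])
print(global_params)  # [2.5, 1.5]
```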
no code implementations • 26 May 2022 • Junru Shao, Xiyou Zhou, Siyuan Feng, Bohan Hou, Ruihang Lai, Hongyi Jin, Wuwei Lin, Masahiro Masuda, Cody Hao Yu, Tianqi Chen
Experimental results show that MetaSchedule can cover the search space used in the state-of-the-art tensor program optimization frameworks in a modular way.
1 code implementation • 26 Jan 2022 • Piotr Żelasko, Siyuan Feng, Laureano Moro Velazquez, Ali Abavisani, Saurabhchand Bhati, Odette Scharenborg, Mark Hasegawa-Johnson, Najim Dehak
In this paper, we 1) investigate the influence of different factors (i.e., model architecture, phonotactic model, type of speech representation) on phone recognition in an unknown language; 2) provide an analysis of which phones transfer well across languages and which do not in order to understand the limitations of and areas for further improvement for automatic phone inventory creation; and 3) present different methods to build a phone inventory of an unseen language in an unsupervised way.
Automatic Speech Recognition (ASR) +3
no code implementations • 13 Jan 2022 • Luke Prananta, Bence Mark Halpern, Siyuan Feng, Odette Scharenborg
In this paper, we investigate several existing generative adversarial network (GAN)-based voice conversion methods, as well as a new state-of-the-art one, for enhancing dysarthric speech to improve dysarthric speech recognition.
no code implementations • 29 Sep 2021 • Liming Wang, Siyuan Feng, Mark A. Hasegawa-Johnson, Chang D. Yoo
Phonemes are defined by their relationship to words: changing a phoneme changes the word.
1 code implementation • 2 Apr 2021 • Siyuan Feng, Piotr Żelasko, Laureano Moro-Velázquez, Odette Scharenborg
In the first stage, a recently proposed unsupervised subword modeling method is improved by replacing a monolingual out-of-domain (OOD) ASR system with a multilingual one, to create a subword-discriminative representation that is more language-independent.
1 code implementation • 28 Mar 2021 • Siyuan Feng, Olya Kudina, Bence Mark Halpern, Odette Scharenborg
Practice and recent evidence suggest that state-of-the-art (SotA) ASRs struggle with the large variation in speech due to, e.g., gender, age, speech impairment, race, and accents.
Automatic Speech Recognition (ASR) +1
no code implementations • 17 Dec 2020 • Siyuan Feng, Odette Scharenborg
Taken together, the analyses showed that the two stages in our approach are both effective in capturing phoneme and AF information.
no code implementations • 11 Nov 2020 • Jintao Ke, Siyuan Feng, Zheng Zhu, Hai Yang, Jieping Ye
To address this issue, we propose a deep multi-task multi-graph learning approach, which combines two components: (1) multiple multi-graph convolutional (MGC) networks for predicting demands for different service modes, and (2) multi-task learning modules that enable knowledge sharing across multiple MGC networks.
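The first component can be illustrated by a single multi-graph convolution step: several adjacency matrices (e.g. spatial neighborhood, functional similarity, connectivity) each propagate node features, and the results are combined. This pure-Python toy omits learnable weights and is not the paper's exact MGC formulation:

```python
# Hedged sketch of a multi-graph convolution: propagate node features
# through several graphs over the same nodes and sum the results.
# Learnable weight matrices are omitted for clarity.

def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def multi_graph_conv(adjacencies, x):
    """Sum of A_g @ x over all graphs g."""
    out = [0.0] * len(x)
    for A in adjacencies:
        hx = matvec(A, x)
        out = [o + h for o, h in zip(out, hx)]
    return out

A_near = [[0, 1], [1, 0]]   # spatial-neighbor graph
A_func = [[1, 0], [0, 1]]   # functional-similarity graph (self-loops here)
print(multi_graph_conv([A_near, A_func], [2.0, 5.0]))  # [7.0, 7.0]
```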
no code implementations • 3 Nov 2020 • Man-Ling Sung, Siyuan Feng, Tan Lee
With acoustic models trained without supervision, a given audio archive is represented by a pseudo transcription, from which spoken keywords can be discovered by string mining algorithms.
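The string-mining step can be sketched with a simple repeated-n-gram miner over the pseudo transcription: recurring label substrings are treated as candidate spoken keywords (illustrative only; real systems use more robust string-matching):

```python
# Hedged sketch of keyword discovery by string mining: decode audio
# into a pseudo transcription of discovered unit labels, then report
# n-grams of labels that recur as candidate spoken keywords.
from collections import Counter

def mine_keywords(pseudo_transcript, n=3, min_count=2):
    """Return label n-grams occurring at least min_count times."""
    grams = Counter(
        tuple(pseudo_transcript[i:i + n])
        for i in range(len(pseudo_transcript) - n + 1)
    )
    return {g for g, c in grams.items() if c >= min_count}

# Pseudo transcription as a sequence of unit labels (illustrative).
transcript = list("abXYZcdXYZef")
print(mine_keywords(transcript))  # the recurring "XYZ" pattern
```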
Automatic Speech Recognition (ASR) +2
1 code implementation • 23 Oct 2020 • Xinsheng Wang, Siyuan Feng, Jihua Zhu, Mark Hasegawa-Johnson, Odette Scharenborg
This paper proposes a new model, referred to as the show and speak (SAS) model, which, for the first time, directly synthesizes spoken descriptions of images, bypassing the need for any text or phonemes.
1 code implementation • 22 Oct 2020 • Siyuan Feng, Piotr Żelasko, Laureano Moro-Velázquez, Ali Abavisani, Mark Hasegawa-Johnson, Odette Scharenborg, Najim Dehak
Furthermore, we find that a multilingual LM hurts a multilingual ASR system's performance, and retaining only the target language's phonotactic data in LM training is preferable.
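The role of a phonotactic LM here can be sketched with a toy bigram model over phones: sequences that violate the target language's phonotactics receive low scores. The inventory and training phone strings below are invented for illustration:

```python
# Hedged sketch of a phonotactic language model: a bigram model over
# phones (with add-one smoothing) scores candidate phone sequences,
# penalizing phone pairs unattested in the target language.
import math
from collections import Counter, defaultdict

def train_phonotactic_lm(phone_corpus, inventory):
    counts = defaultdict(Counter)
    for seq in phone_corpus:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    V = len(inventory)

    def logprob(seq):
        s = 0.0
        for a, b in zip(seq, seq[1:]):
            s += math.log((counts[a][b] + 1) / (sum(counts[a].values()) + V))
        return s

    return logprob

inventory = ["t", "a", "s", "k"]
corpus = [["t", "a", "s"], ["s", "a", "t"], ["t", "a", "k"]]
lm = train_phonotactic_lm(corpus, inventory)
print(lm(["t", "a", "s"]) > lm(["s", "s", "s"]))  # True: "ss" unattested
```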
Automatic Speech Recognition (ASR) +2
no code implementations • 29 Jul 2020 • Siyuan Feng
The first problem concerns unsupervised discovery of basic (subword level) speech units in a given language.
Automatic Speech Recognition (ASR) +2
no code implementations • 25 Jul 2020 • Siyuan Feng, Odette Scharenborg
Our system is less sensitive to the amount of training data once it exceeds 50 hours.
no code implementations • 30 Oct 2019 • Zhiyuan Peng, Siyuan Feng, Tan Lee
The USM experiments on ZeroSpeech 2017 dataset verify that the frame tokenizer is able to capture linguistic content and the utterance embedder can acquire speaker-related information.
no code implementations • 9 Aug 2019 • Siyuan Feng, Tan Lee
Out-of-domain ASR systems can be applied to perform speaker adaptation with untranscribed training data of the target language, and to decode the training speech into frame-level labels for DNN training.
Automatic Speech Recognition (ASR) +3
no code implementations • 17 Jun 2019 • Siyuan Feng, Tan Lee, Zhiyuan Peng
Experimental results on ZeroSpeech 2017 show that both approaches are effective, with the latter being more prominent, and that combining them brings a further marginal improvement in the across-speaker condition.
no code implementations • 17 Jun 2019 • Siyuan Feng, Tan Lee
This study tackles unsupervised subword modeling in the zero-resource scenario, learning frame-level speech representation that is phonetically discriminative and speaker-invariant, using only untranscribed speech for target languages.
1 code implementation • 13 May 2019 • Huichu Zhang, Siyuan Feng, Chang Liu, Yaoyao Ding, Yichen Zhu, Zihan Zhou, Wei-Nan Zhang, Yong Yu, Haiming Jin, Zhenhui Li
The most commonly used open-source traffic simulator, SUMO, is, however, not scalable to large road networks and heavy traffic flows, which hinders the study of reinforcement learning on such traffic scenarios.
Multi-agent Reinforcement Learning, reinforcement-learning +1
2 code implementations • ICLR 2019 • Sidi Lu, Lantao Yu, Siyuan Feng, Yaoming Zhu, Wei-Nan Zhang, Yong Yu
In this paper, we study the generative models of sequential discrete data.