Search Results for author: Siyuan Feng

Found 30 papers, 9 papers with code

Dynamic Adjustment of Matching Radii under the Broadcasting Mode: A Novel Multitask Learning Strategy and Temporal Modeling Approach

no code implementations • 9 Dec 2023 • Taijie Chen, Zijian Shen, Siyuan Feng, Linchuan Yang, Jintao Ke

To simultaneously maximize multiple system performance metrics for matching radius determination, we devise a novel multi-task learning algorithm that enhances convergence speed of each task (corresponding to the optimization of one metric) and delivers more accurate overall predictions.

Multi-Task Learning

Language-Universal Phonetic Representation in Multilingual Speech Pretraining for Low-Resource Speech Recognition

no code implementations • 19 May 2023 • Siyuan Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang

Moreover, on 3 of the 4 languages, comparing to the standard HuBERT, the approach performs better, meanwhile is able to save supervised training data by 1.5k hours (75%) at most.

Self-Supervised Learning speech-recognition +1

Language-universal phonetic encoder for low-resource speech recognition

no code implementations • 19 May 2023 • Siyuan Feng, Ming Tu, Rui Xia, Chuanzeng Huang, Yuxuan Wang

Our main approach and adaptation are effective on extremely low-resource languages, even within domain- and language-mismatched scenarios.

speech-recognition Speech Recognition

A multi-functional simulation platform for on-demand ride service operations

1 code implementation • 22 Mar 2023 • Siyuan Feng, Taijie Chen, Yuhao Zhang, Jintao Ke, Zhengfei Zheng, Hai Yang

In addition, the existing simulators still face many challenges, ranging from their closeness to real environments of ride-sourcing systems, to the completeness of different tasks they can implement.

Cross-City Traffic Prediction via Semantic-Fused Hierarchical Graph Transfer Learning

no code implementations • 23 Feb 2023 • Kehua Chen, Jindong Han, Siyuan Feng, Hai Yang

In this paper, we propose Semantic-Fused Hierarchical Graph Transfer Learning (SF-HGTL) model to achieve knowledge transfer across cities with fused semantics.

Management Retrieval +2

Conditional Energy-Based Models for Implicit Policies: The Gap between Theory and Practice

no code implementations • 12 Jul 2022 • Duy-Nguyen Ta, Eric Cousineau, Huihua Zhao, Siyuan Feng

We present our findings in the gap between theory and practice of using conditional energy-based models (EBM) as an implicit representation for behavior-cloned policies.

regression

TensorIR: An Abstraction for Automatic Tensorized Program Optimization

2 code implementations • 9 Jul 2022 • Siyuan Feng, Bohan Hou, Hongyi Jin, Wuwei Lin, Junru Shao, Ruihang Lai, Zihao Ye, Lianmin Zheng, Cody Hao Yu, Yong Yu, Tianqi Chen

Finally, we build an end-to-end framework on top of our abstraction to automatically optimize deep learning models for given tensor computation primitives.

BIG-bench Machine Learning

A Cross-City Federated Transfer Learning Framework: A Case Study on Urban Region Profiling

no code implementations • 31 May 2022 • Gaode Chen, Yijun Su, Xinghua Zhang, Anmin Hu, Guochun Chen, Siyuan Feng, Ji Xiang, Junbo Zhang, Yu Zheng

To address the above challenging problems, we propose a novel Cross-city Federated Transfer Learning framework (CcFTL) to cope with the data insufficiency and privacy problems.

Transfer Learning

Tensor Program Optimization with Probabilistic Programs

no code implementations • 26 May 2022 • Junru Shao, Xiyou Zhou, Siyuan Feng, Bohan Hou, Ruihang Lai, Hongyi Jin, Wuwei Lin, Masahiro Masuda, Cody Hao Yu, Tianqi Chen

Experimental results show that MetaSchedule can cover the search space used in the state-of-the-art tensor program optimization frameworks in a modular way.

Probabilistic Programming

Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition

1 code implementation • 26 Jan 2022 • Piotr Żelasko, Siyuan Feng, Laureano Moro Velazquez, Ali Abavisani, Saurabhchand Bhati, Odette Scharenborg, Mark Hasegawa-Johnson, Najim Dehak

In this paper, we 1) investigate the influence of different factors (i.e., model architecture, phonotactic model, type of speech representation) on phone recognition in an unknown language; 2) provide an analysis of which phones transfer well across languages and which do not in order to understand the limitations of and areas for further improvement for automatic phone inventory creation; and 3) present different methods to build a phone inventory of an unseen language in an unsupervised way.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

The Effectiveness of Time Stretching for Enhancing Dysarthric Speech for Improved Dysarthric Speech Recognition

no code implementations • 13 Jan 2022 • Luke Prananta, Bence Mark Halpern, Siyuan Feng, Odette Scharenborg

In this paper, we investigate several existing and a new state-of-the-art generative adversarial network-based (GAN) voice conversion method for enhancing dysarthric speech for improved dysarthric speech recognition.

Generative Adversarial Network speech-recognition +2

Unsupervised Acoustic Unit Discovery by Leveraging a Language-Independent Subword Discriminative Feature Representation

1 code implementation • 2 Apr 2021 • Siyuan Feng, Piotr Żelasko, Laureano Moro-Velázquez, Odette Scharenborg

In the first stage, a recently proposed method in the task of unsupervised subword modeling is improved by replacing a monolingual out-of-domain (OOD) ASR system with a multilingual one to create a subword-discriminative representation that is more language-independent.

Acoustic Unit Discovery Clustering

Quantifying Bias in Automatic Speech Recognition

1 code implementation • 28 Mar 2021 • Siyuan Feng, Olya Kudina, Bence Mark Halpern, Odette Scharenborg

Practice and recent evidence suggests that the state-of-the-art (SotA) ASRs struggle with the large variation in speech due to e.g., gender, age, speech impairment, race, and accents.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

The effectiveness of unsupervised subword modeling with autoregressive and cross-lingual phone-aware networks

no code implementations • 17 Dec 2020 • Siyuan Feng, Odette Scharenborg

Taken together, the analyses showed that the two stages in our approach are both effective in capturing phoneme and AF information.

Self-Supervised Learning Transfer Learning

Joint predictions of multi-modal ride-hailing demands: a deep multi-task multigraph learning-based approach

no code implementations • 11 Nov 2020 • Jintao Ke, Siyuan Feng, Zheng Zhu, Hai Yang, Jieping Ye

To address this issue, we propose a deep multi-task multi-graph learning approach, which combines two components: (1) multiple multi-graph convolutional (MGC) networks for predicting demands for different service modes, and (2) multi-task learning modules that enable knowledge sharing across multiple MGC networks.

Graph Learning Multi-Task Learning

Unsupervised Pattern Discovery from Thematic Speech Archives Based on Multilingual Bottleneck Features

no code implementations • 3 Nov 2020 • Man-Ling Sung, Siyuan Feng, Tan Lee

With the unsupervisedly trained acoustic models, a given audio archive is represented by a pseudo transcription, from which spoken keywords can be discovered by string mining algorithms.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Show and Speak: Directly Synthesize Spoken Description of Images

1 code implementation • 23 Oct 2020 • Xinsheng Wang, Siyuan Feng, Jihua Zhu, Mark Hasegawa-Johnson, Odette Scharenborg

This paper proposes a new model, referred to as the show and speak (SAS) model that, for the first time, is able to directly synthesize spoken descriptions of images, bypassing the need for any text or phonemes.

How Phonotactics Affect Multilingual and Zero-shot ASR Performance

1 code implementation • 22 Oct 2020 • Siyuan Feng, Piotr Żelasko, Laureano Moro-Velázquez, Ali Abavisani, Mark Hasegawa-Johnson, Odette Scharenborg, Najim Dehak

Furthermore, we find that a multilingual LM hurts a multilingual ASR system's performance, and retaining only the target language's phonotactic data in LM training is preferable.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Unsupervised Subword Modeling Using Autoregressive Pretraining and Cross-Lingual Phone-Aware Modeling

no code implementations • 25 Jul 2020 • Siyuan Feng, Odette Scharenborg

Our system is less sensitive to training data amount when the training data is over 50 hours.

Mixture factorized auto-encoder for unsupervised hierarchical deep factorization of speech signal

no code implementations • 30 Oct 2019 • Zhiyuan Peng, Siyuan Feng, Tan Lee

The USM experiments on ZeroSpeech 2017 dataset verify that the frame tokenizer is able to capture linguistic content and the utterance embedder can acquire speaker-related information.

Clustering Speaker Verification

Exploiting Cross-Lingual Speaker and Phonetic Diversity for Unsupervised Subword Modeling

no code implementations • 9 Aug 2019 • Siyuan Feng, Tan Lee

Out-of-domain ASR systems can be applied to perform speaker adaptation with untranscribed training data of the target language, and to decode the training speech into frame-level labels for DNN training.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Combining Adversarial Training and Disentangled Speech Representation for Robust Zero-Resource Subword Modeling

no code implementations • 17 Jun 2019 • Siyuan Feng, Tan Lee, Zhiyuan Peng

Experimental results on ZeroSpeech 2017 show that both approaches are effective while the latter is more prominent, and that their combination brings further marginal improvement in across-speaker condition.

Representation Learning

Improving Unsupervised Subword Modeling via Disentangled Speech Representation Learning and Transformation

no code implementations • 17 Jun 2019 • Siyuan Feng, Tan Lee

This study tackles unsupervised subword modeling in the zero-resource scenario, learning frame-level speech representation that is phonetically discriminative and speaker-invariant, using only untranscribed speech for target languages.

Clustering Representation Learning

CityFlow: A Multi-Agent Reinforcement Learning Environment for Large Scale City Traffic Scenario

1 code implementation • 13 May 2019 • Huichu Zhang, Siyuan Feng, Chang Liu, Yaoyao Ding, Yichen Zhu, Zihan Zhou, Wei-Nan Zhang, Yong Yu, Haiming Jin, Zhenhui Li

The most commonly used open-source traffic simulator, SUMO, is, however, not scalable to large road networks and large traffic flows, which hinders the study of reinforcement learning on traffic scenarios.

Multi-agent Reinforcement Learning reinforcement-learning +1
