Search Results for author: Syu-Siang Wang

Found 15 papers, 3 papers with code

IANS: Intelligibility-aware Null-steering Beamforming for Dual-Microphone Arrays

no code implementations • 9 Jul 2023 • Wen-Yuan Ting, Syu-Siang Wang, Yu Tsao, Borching Su

Beamforming techniques are popular in speech-related applications due to their effective spatial filtering capabilities.

Continuous Speech for Improved Learning Pathological Voice Disorders

no code implementations • 22 Feb 2022 • Syu-Siang Wang, Chi-Te Wang, Chih-Chung Lai, Yu Tsao, Shih-Hau Fang

The experiments were conducted on a large-scale database, wherein 1,045 continuous speech samples were collected by the speech clinic of a hospital from 2012 to 2019.

Speech Enhancement-assisted Voice Conversion in Noisy Environments

no code implementations • 19 Oct 2021 • Yun-Ju Chan, Chiang-Jen Peng, Syu-Siang Wang, Hsin-Min Wang, Yu Tsao, Tai-Shih Chi

Numerous voice conversion (VC) techniques have been proposed for the conversion of voices among different speakers.

Tasks: Speech Enhancement, Voice Conversion

Speech Enhancement Based on CycleGAN with Noise-informed Training

no code implementations • 19 Oct 2021 • Wen-Yuan Ting, Syu-Siang Wang, Hsin-Li Chang, Borching Su, Yu Tsao

Herein, we investigate a potential limitation of the clean-to-noisy conversion part and propose a novel noise-informed training (NIT) approach to improve the performance of the original CycleGAN SE system.

Tasks: Speech Enhancement
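The snippet above does not spell out the NIT mechanism, so the following PyTorch sketch is only one plausible reading: condition the clean-to-noisy generator on a noise-type embedding so that the conversion is not blind to the target noise. Every layer size and name here is a hypothetical illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class NoiseInformedGenerator(nn.Module):
    """Clean-to-noisy generator conditioned on a noise-type embedding.

    Hypothetical sketch of a noise-informed setup: the noise class of the
    target noisy utterance is embedded and concatenated to every frame, so
    the clean-to-noisy mapping knows which noise it should synthesize.
    """

    def __init__(self, n_bins=257, n_noise_types=10, emb_dim=16):
        super().__init__()
        self.noise_emb = nn.Embedding(n_noise_types, emb_dim)
        self.net = nn.Sequential(
            nn.Linear(n_bins + emb_dim, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
            nn.Linear(512, n_bins),
        )

    def forward(self, clean_mag, noise_id):
        # clean_mag: (batch, frames, bins); noise_id: (batch,)
        emb = self.noise_emb(noise_id)                        # (batch, emb_dim)
        emb = emb.unsqueeze(1).expand(-1, clean_mag.size(1), -1)
        return self.net(torch.cat([clean_mag, emb], dim=-1))
```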

Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario

1 code implementation • 7 Jan 2021 • Chiang-Jen Peng, Yun-Ju Chan, Cheng Yu, Syu-Siang Wang, Yu Tsao, Tai-Shih Chi

In this study, we propose an attention-based MTL (ATM) approach that integrates MTL and the attention-weighting mechanism to simultaneously realize a multi-model learning structure that performs speech enhancement (SE) and speaker identification (SI).

Tasks: Multi-Task Learning, Speaker Identification, +1
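The entry above describes the ATM structure only at a high level; the sketch below shows one minimal way such an attention-weighted multi-task model could be laid out, assuming a shared frame encoder, learned per-frame attention, an SE masking head, and an attention-pooled SI head. All dimensions and module names are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn as nn

class ATMSketch(nn.Module):
    """Shared encoder with attention weighting feeding SE and SI heads."""

    def __init__(self, n_bins=257, hidden=256, n_speakers=100):
        super().__init__()
        self.encoder = nn.LSTM(n_bins, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)            # per-frame attention score
        self.se_head = nn.Linear(hidden, n_bins)    # frame-wise enhancement mask
        self.si_head = nn.Linear(hidden, n_speakers)

    def forward(self, noisy_mag):                   # noisy_mag: (B, T, bins)
        h, _ = self.encoder(noisy_mag)              # (B, T, hidden)
        w = torch.softmax(self.attn(h), dim=1)      # (B, T, 1) attention weights
        mask = torch.sigmoid(self.se_head(h * w))   # SE output: spectral mask
        pooled = (h * w).sum(dim=1)                 # attention pooling over time
        return noisy_mag * mask, self.si_head(pooled)  # enhanced mag, SI logits
```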

CITISEN: A Deep Learning-Based Speech Signal-Processing Mobile Application

1 code implementation • 21 Aug 2020 • Yu-Wen Chen, Kuo-Hsuan Hung, You-Jin Li, Alexander Chao-Fu Kang, Ya-Hsin Lai, Kai-Chun Liu, Szu-Wei Fu, Syu-Siang Wang, Yu Tsao

CITISEN provides three functions: speech enhancement (SE), model adaptation (MA), and background noise conversion (BNC), allowing it to serve as a platform for utilizing and evaluating SE models and for flexibly extending them to address various noise environments and users.

Tasks: Acoustic Scene Classification, Data Augmentation, +2

Boosting Objective Scores of a Speech Enhancement Model by MetricGAN Post-processing

no code implementations • 18 Jun 2020 • Szu-Wei Fu, Chien-Feng Liao, Tsun-An Hsieh, Kuo-Hsuan Hung, Syu-Siang Wang, Cheng Yu, Heng-Cheng Kuo, Ryandhimas E. Zezario, You-Jin Li, Shang-Yi Chuang, Yen-Ju Lu, Yu Tsao

The Transformer architecture has demonstrated a superior ability compared to recurrent neural networks in many different natural language processing applications.

Tasks: Speech Enhancement
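For readers unfamiliar with the MetricGAN idea named in the title: a learned "metric" network is trained to regress an objective score (such as PESQ) of enhanced speech, and the enhancement model is then updated to maximize that predicted score. The sketch below illustrates one such alternating step; `G`, `D`, and `pesq_score` are hypothetical stand-ins, not the paper's components.

```python
import torch

def metricgan_step(G, D, noisy, clean, pesq_score, opt_g, opt_d):
    """One alternating MetricGAN-style update (illustrative only)."""
    # Metric-network step: learn to predict the true objective score.
    with torch.no_grad():
        enhanced = G(noisy)
    target = pesq_score(enhanced, clean)            # score normalized to [0, 1]
    d_loss = ((D(enhanced, clean) - target) ** 2).mean()
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Enhancement-model step: push the predicted score toward the maximum.
    enhanced = G(noisy)
    g_loss = ((D(enhanced, clean) - 1.0) ** 2).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```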

MIMO Speech Compression and Enhancement Based on Convolutional Denoising Autoencoder

no code implementations • 24 May 2020 • You-Jin Li, Syu-Siang Wang, Yu Tsao, Borching Su

For speech-related applications in IoT environments, identifying effective methods to handle interference noises and compress the amount of data in transmissions is essential to achieve high-quality services.

Tasks: Denoising
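A convolutional denoising autoencoder of the kind named in the title can be read as a strided encoder that compresses the waveform into a narrow latent code (addressing the transmission concern above) plus a decoder that reconstructs denoised speech. The sketch below is a minimal PyTorch rendering under that reading; channel counts and strides are illustrative assumptions.

```python
import torch.nn as nn

class ConvDAE(nn.Module):
    """Convolutional denoising autoencoder sketch: the bottleneck carries
    far fewer values than the input waveform, so it doubles as a compressed
    representation for transmission."""

    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(                  # 4x temporal downsampling
            nn.Conv1d(1, 16, 9, stride=2, padding=4), nn.ReLU(),
            nn.Conv1d(16, 4, 9, stride=2, padding=4),  # low-rate bottleneck
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(4, 16, 9, stride=2, padding=4, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(16, 1, 9, stride=2, padding=4, output_padding=1),
        )

    def forward(self, noisy_wave):                     # (batch, 1, samples)
        code = self.encoder(noisy_wave)                # compressed latent code
        return self.decoder(code)                      # denoised reconstruction
```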

Time-Domain Multi-modal Bone/air Conducted Speech Enhancement

no code implementations • 22 Nov 2019 • Cheng Yu, Kuo-Hsuan Hung, Syu-Siang Wang, Szu-Wei Fu, Yu Tsao, Jeih-weih Hung

Previous studies have shown that integrating video signals as a complementary modality can improve speech enhancement (SE) performance.

Tasks: Ensemble Learning, Speech Enhancement

Speech Enhancement Based on Reducing the Detail Portion of Speech Spectrograms in Modulation Domain via Discrete Wavelet Transform

1 code implementation • 8 Nov 2018 • Shih-kuang Lee, Syu-Siang Wang, Yu Tsao, Jeih-weih Hung

The presented DWT-based SE method with various scaling factors for the detail part is evaluated on a subset of the Aurora-2 database, and the PESQ metric is used to indicate the quality of the processed speech signals.

Tasks: Speech Enhancement
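The core operation evaluated above is shrinking the detail portion of a DWT decomposition by a scaling factor. A minimal sketch of that operation, assuming a single-level DWT taken along the time (modulation) axis of each spectrogram channel with PyWavelets, is given below; the wavelet and the default factor are illustrative choices, not the paper's settings.

```python
import numpy as np
import pywt  # PyWavelets

def scale_dwt_detail(spectrogram, scale=0.5, wavelet="db4"):
    """Shrink the detail DWT coefficients of each frequency channel.

    spectrogram: 2-D array (frequency bins x time frames).
    scale: factor applied to the detail part before reconstruction.
    """
    out = np.empty_like(spectrogram)
    for k in range(spectrogram.shape[0]):              # per frequency channel
        approx, detail = pywt.dwt(spectrogram[k], wavelet)
        rec = pywt.idwt(approx, scale * detail, wavelet)
        out[k] = rec[: spectrogram.shape[1]]           # trim DWT padding
    return out
```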

Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks

no code implementations • 1 Sep 2017 • Jen-Cheng Hou, Syu-Siang Wang, Ying-Hui Lai, Yu Tsao, Hsiu-Wen Chang, Hsin-Min Wang

More precisely, the proposed AVDCNN model is structured as an audio-visual encoder-decoder network, in which audio and visual data are first processed by individual CNNs and then fused into a joint network that generates enhanced speech (the primary task) and reconstructed images (the secondary task) at the output layer.

Tasks: Multi-Task Learning, Speech Enhancement
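Since the snippet above spells out the AVDCNN layout, a minimal sketch of that two-branch encoder-decoder structure is easy to give; all layer sizes below are illustrative assumptions rather than the published configuration.

```python
import torch
import torch.nn as nn

class AVDCNNSketch(nn.Module):
    """Audio and visual CNN branches fused into a joint network that emits
    enhanced speech (primary task) and a reconstructed image (secondary task)."""

    def __init__(self, n_bins=257, img_dim=64 * 64):
        super().__init__()
        self.audio_cnn = nn.Sequential(
            nn.Conv1d(n_bins, 128, 3, padding=1), nn.ReLU())
        self.visual_cnn = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())     # -> (B, 8*4*4)
        self.fusion = nn.Linear(128 + 8 * 4 * 4, 256)
        self.speech_out = nn.Linear(256, n_bins)       # primary task
        self.image_out = nn.Linear(256, img_dim)       # secondary task

    def forward(self, audio_mag, mouth_img):
        # audio_mag: (B, n_bins, T); mouth_img: (B, 1, 64, 64)
        a = self.audio_cnn(audio_mag).mean(dim=2)      # (B, 128) audio features
        v = self.visual_cnn(mouth_img)                 # (B, 128) visual features
        z = torch.relu(self.fusion(torch.cat([a, v], dim=1)))
        return self.speech_out(z), self.image_out(z)
```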
