Search Results for author: Meng Yu

Found 37 papers, 7 papers with code

Joint Conditional Diffusion Model for Image Restoration with Mixed Degradations

no code implementations • 11 Apr 2024 • Yufeng Yue, Meng Yu, Luojie Yang, Yi Yang

Image restoration is rather challenging in adverse weather conditions, especially when multiple degradations occur simultaneously.

Image Restoration


VIFNet: An End-to-end Visible-Infrared Fusion Network for Image Dehazing

no code implementations • 11 Apr 2024 • Meng Yu, Te Cui, Haoyang Lu, Yufeng Yue

Image dehazing poses significant challenges in environmental perception.

Image Dehazing


Deep Audio Zooming: Beamwidth-Controllable Neural Beamformer

no code implementations • 22 Nov 2023 • Meng Yu, Dong Yu

Audio zooming, a signal processing technique, enables selective focusing and enhancement of sound signals from a specified region while attenuating others.


Advancing Acoustic Howling Suppression through Recursive Training of Neural Networks

no code implementations • 27 Sep 2023 • Hao Zhang, Yixuan Zhang, Meng Yu, Dong Yu

In this paper, we introduce a novel training framework designed to comprehensively address the acoustic howling issue by examining its fundamental formation process.

Acoustic echo cancellation


Neural Network Augmented Kalman Filter for Robust Acoustic Howling Suppression

no code implementations • 27 Sep 2023 • Yixuan Zhang, Hao Zhang, Meng Yu, Dong Yu

Acoustic howling suppression (AHS) is a critical challenge in audio communication systems.


Unifying Robustness and Fidelity: A Comprehensive Study of Pretrained Generative Methods for Speech Enhancement in Adverse Conditions

no code implementations • 16 Sep 2023 • Heming Wang, Meng Yu, Hao Zhang, Chunlei Zhang, Zhongweiyang Xu, Muqiao Yang, Yixuan Zhang, Dong Yu

Enhancing speech signal quality in adverse acoustic environments is a persistent challenge in speech processing.

Speech Enhancement


Hybrid AHS: A Hybrid of Kalman Filter and Deep Learning for Acoustic Howling Suppression

no code implementations • 4 May 2023 • Hao Zhang, Meng Yu, Yuzhong Wu, Tao Yu, Dong Yu

During offline training, a pre-processed signal obtained from the Kalman filter and an ideal microphone signal generated via a teacher-forced training strategy are used to train the deep neural network (DNN).


Deep Learning for Joint Acoustic Echo and Acoustic Howling Suppression in Hybrid Meetings

no code implementations • 2 May 2023 • Hao Zhang, Meng Yu, Dong Yu

In particular, the interplay between acoustic echo and acoustic howling in a hybrid meeting makes their joint suppression difficult.

Speech Separation


Deep AHS: A Deep Learning Approach to Acoustic Howling Suppression

no code implementations • 18 Feb 2023 • Hao Zhang, Meng Yu, Dong Yu

In this paper, we formulate acoustic howling suppression (AHS) as a supervised learning problem and propose a deep learning approach, called Deep AHS, to address it.

Speech Separation


NeuralKalman: A Learnable Kalman Filter for Acoustic Echo Cancellation

no code implementations • 29 Jan 2023 • Yixuan Zhang, Meng Yu, Hao Zhang, Dong Yu, DeLiang Wang

The robustness of the Kalman filter to double talk and its rapid convergence make it a popular approach for addressing acoustic echo cancellation (AEC) challenges.

Acoustic echo cancellation


Deep Neural Mel-Subband Beamformer for In-car Speech Separation

no code implementations • 22 Nov 2022 • Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu

While current deep learning (DL)-based beamforming techniques have proven effective in speech separation, they are often designed to process narrow-band (NB) frequencies independently, which results in higher computational costs and inference times, making them unsuitable for real-world use.

Speech Separation


Identification of cancer-keeping genes as therapeutic targets by finding network control hubs

no code implementations • 13 Jun 2022 • Xizhe Zhang, Chunyu Pan, Xinru Wei, Meng Yu, Shuangjie Liu, Jun An, Jieping Yang, Baojun Wei, Wenjun Hao, Yang Yao, Yuyan Zhu, Weixiong Zhang

One of the recent approaches is based on network structural controllability that focuses on finding a control scheme and driver genes that can steer the cell from an arbitrary state to a designated state.


NeuralEcho: A Self-Attentive Recurrent Neural Network For Unified Acoustic Echo Suppression And Speech Enhancement

no code implementations • 20 May 2022 • Meng Yu, Yong Xu, Chunlei Zhang, Shi-Xiong Zhang, Dong Yu

Acoustic echo cancellation (AEC) plays an important role in full-duplex speech communication as well as in front-end speech enhancement for recognition when the loudspeaker is playing back.

Acoustic echo cancellation Speech Enhancement +2


EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers

1 code implementation • 31 Mar 2022 • Soumi Maiti, Yushi Ueda, Shinji Watanabe, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Yong Xu

In this paper, we present a novel framework that jointly performs three tasks: speaker diarization, speech separation, and speaker counting.

Decoder speaker-diarization +2


Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE

no code implementations • 30 Mar 2022 • Ziang Long, Yunling Zheng, Meng Yu, Jack Xin

The variational auto-encoder (VAE) is an effective neural network architecture for disentangling a speech utterance into speaker identity and linguistic content latent embeddings, and then generating an utterance for a target speaker from that of a source speaker.

Decoder Sentence +1


Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization

no code implementations • 29 Nov 2021 • Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu

Conversational bilingual speech encompasses three types of utterances: two purely monolingual types and one intra-sententially code-switched type.

Speech Recognition


Joint Neural AEC and Beamforming with Double-Talk Detection

no code implementations • 9 Nov 2021 • Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu

We train the proposed model in an end-to-end approach to eliminate background noise and echoes from far-end audio devices, which include nonlinear distortions.

Acoustic echo cancellation Denoising +2


FAST-RIR: Fast neural diffuse room impulse response generator

2 code implementations • 7 Oct 2021 • Anton Ratnarajah, Shi-Xiong Zhang, Meng Yu, Zhenyu Tang, Dinesh Manocha, Dong Yu

We present a neural-network-based fast diffuse room impulse response generator (FAST-RIR) for generating room impulse responses (RIRs) for a given acoustic environment.

Automatic Speech Recognition (ASR) +2


MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation

no code implementations • 17 Apr 2021 • Xiyun Li, Yong Xu, Meng Yu, Shi-Xiong Zhang, Jiaming Xu, Bo Xu, Dong Yu

The spatial self-attention module is designed to attend to the cross-channel correlation in the covariance matrices.

Automatic Speech Recognition (ASR) +2


MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment

no code implementations • 2 Apr 2021 • Meng Yu, Chunlei Zhang, Yong Xu, Shi-Xiong Zhang, Dong Yu

Objective speech quality assessment is usually conducted by comparing the received speech signal with its clean reference, whereas human listeners can evaluate speech quality without any reference, as in mean opinion score (MOS) tests.


TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation

no code implementations • 31 Mar 2021 • Helin Wang, Bo Wu, LianWu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu

In this paper, we explore an effective way to leverage contextual information to improve speech dereverberation performance in real-world reverberant environments.

Room Impulse Response (RIR) Speech Dereverberation


Towards Robust Speaker Verification with Target Speaker Enhancement

no code implementations • 16 Mar 2021 • Chunlei Zhang, Meng Yu, Chao Weng, Dong Yu

This paper proposes the target speaker enhancement based speaker verification network (TASE-SVNet), an all neural model that couples target speaker enhancement and speaker embedding extraction for robust speaker verification (SV).

Speaker Verification Speech Enhancement


Deep Learning based Multi-Source Localization with Source Splitting and its Effectiveness in Multi-Talker Speech Recognition

no code implementations • 16 Feb 2021 • Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu

In addition to using the prediction error as a metric for evaluating our localization model, we also establish its potency as a frontend with automatic speech recognition (ASR) as the downstream task.

Automatic Speech Recognition (ASR) +2


Multi-channel Multi-frame ADL-MVDR for Target Speech Separation

no code implementations • 24 Dec 2020 • Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, LianWu Chen, Donald S. Williamson, Dong Yu

Many purely neural network based speech separation approaches have been proposed to improve objective assessment scores, but they often introduce nonlinear distortions that are harmful to modern automatic speech recognition (ASR) systems.

Automatic Speech Recognition (ASR) +2


Self-supervised Text-independent Speaker Verification using Prototypical Momentum Contrastive Learning

1 code implementation • 13 Dec 2020 • Wei Xia, Chunlei Zhang, Chao Weng, Meng Yu, Dong Yu

First, we examine a simple contrastive learning approach (SimCLR) with a momentum contrastive (MoCo) learning framework, where the MoCo speaker embedding system utilizes a queue to maintain a large set of negative examples.

Clustering Contrastive Learning +2


Improving RNN Transducer With Target Speaker Extraction and Neural Uncertainty Estimation

no code implementations • 26 Nov 2020 • Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu

Target-speaker speech recognition aims to recognize target-speaker speech from noisy environments with background noise and interfering speakers.

Speech Enhancement Speech Extraction +1 Sound Audio and Speech Processing


Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization

no code implementations • 30 Oct 2020 • Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Yong Xu, Shi-Xiong Zhang, Dong Yu

The advantages of D-ASR over existing methods are threefold: (1) it provides explicit speaker locations, (2) it improves the explainability factor, and (3) it achieves better ASR performance as the process is more streamlined.

Automatic Speech Recognition (ASR) +1


An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation

1 code implementation • 21 Aug 2020 • Daniel Michelsanti, Zheng-Hua Tan, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, Jesper Jensen

Speech enhancement and speech separation are two related tasks, whose purpose is to extract either one or more target speech signals, respectively, from a mixture of sounds generated by several sources.

Speech Enhancement Speech Separation


ADL-MVDR: All deep learning MVDR beamformer for target speech separation

1 code implementation • 16 Aug 2020 • Zhuohuang Zhang, Yong Xu, Meng Yu, Shi-Xiong Zhang, LianWu Chen, Dong Yu

Speech separation algorithms are often used to separate the target speech from other interfering sources.

Speech Separation


Neural Spatio-Temporal Beamformer for Target Speech Separation

1 code implementation • 8 May 2020 • Yong Xu, Meng Yu, Shi-Xiong Zhang, Lian-Wu Chen, Chao Weng, Jianming Liu, Dong Yu

Purely neural network (NN) based speech separation and enhancement methods, although they can achieve good objective scores, inevitably cause nonlinear speech distortions that are harmful to automatic speech recognition (ASR).

Audio and Speech Processing Sound


Enhancing End-to-End Multi-channel Speech Separation via Spatial Feature Learning

no code implementations • 9 Mar 2020 • Rongzhi Gu, Shi-Xiong Zhang, Lian-Wu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu

Hand-crafted spatial features (e.g., inter-channel phase difference, IPD) play a fundamental role in recent deep learning based multi-channel speech separation (MCSS) methods.

Speech Separation


A Unified Framework for Speech Separation

no code implementations • 17 Dec 2019 • Fahimeh Bahmaninezhad, Shi-Xiong Zhang, Yong Xu, Meng Yu, John H. L. Hansen, Dong Yu

The initial solutions introduced for deep learning based speech separation transformed the speech signals into the time-frequency domain with the STFT; the encoded mixed signals were then fed into a deep neural network based separator.

Speech Separation


Audio-Visual Speech Separation and Dereverberation with a Two-Stage Multimodal Network

no code implementations • 16 Sep 2019 • Ke Tan, Yong Xu, Shi-Xiong Zhang, Meng Yu, Dong Yu

Background noise, interfering speech and room reverberation frequently distort target speech in real listening environments.

Audio and Speech Processing Sound Signal Processing


DurIAN: Duration Informed Attention Network For Multimodal Synthesis

4 code implementations • 4 Sep 2019 • Chengzhu Yu, Heng Lu, Na Hu, Meng Yu, Chao Weng, Kun Xu, Peng Liu, Deyi Tuo, Shiyin Kang, Guangzhi Lei, Dan Su, Dong Yu

In this paper, we present a generic and robust multimodal synthesis system that produces highly natural speech and facial expression simultaneously.

Speech Synthesis


A comprehensive study of speech separation: spectrogram vs waveform separation

no code implementations • 17 May 2019 • Fahimeh Bahmaninezhad, Jian Wu, Rongzhi Gu, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu

We study the speech separation problem for far-field data (more similar to naturalistic audio streams) and develop multi-channel solutions for both frequency-domain and time-domain separators that utilize spectral, spatial, and speaker location information.

Speech Recognition +1


End-to-End Multi-Channel Speech Separation

no code implementations • 15 May 2019 • Rongzhi Gu, Jian Wu, Shi-Xiong Zhang, Lian-Wu Chen, Yong Xu, Meng Yu, Dan Su, Yuexian Zou, Dong Yu

This paper extended the previous approach and proposed a new end-to-end model for multi-channel speech separation.

Speech Separation


Time Domain Audio Visual Speech Separation

no code implementations • 7 Apr 2019 • Jian Wu, Yong Xu, Shi-Xiong Zhang, Lian-Wu Chen, Meng Yu, Lei Xie, Dong Yu

Audio-visual multi-modal modeling has been demonstrated to be effective in many speech related tasks, such as speech recognition and speech enhancement.

Audio and Speech Processing Sound

