Search Results for author: Ming Lei

Found 16 papers, 4 papers with code

Self-Critical Alternate Learning based Semantic Broadcast Communication

no code implementations • 3 Dec 2023 • Zhilin Lu, Rongpeng Li, Ming Lei, Chan Wang, Zhifeng Zhao, Honggang Zhang

In particular, to enable stable optimization via a nondifferentiable semantic metric, we regard sentence similarity as a reward and formulate this learning process as an RL problem.

Reinforcement Learning (RL) • Semantic Similarity • +3

Joint Scattering Environment Sensing and Channel Estimation Based on Non-stationary Markov Random Field

no code implementations • 6 Feb 2023 • Wenkang Xu, Yongbo Xiao, An Liu, Ming Lei, MinJian Zhao

A location domain channel modeling method is proposed based on the position of targets and scatterers in the scattering environment, and the resulting radar and communication channels exhibit a two-dimensional (2-D) joint burst sparsity.

Bayesian Inference • Position

ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech

no code implementations • 16 Feb 2022 • Yi Ren, Ming Lei, Zhiying Huang, Shiliang Zhang, Qian Chen, Zhijie Yan, Zhou Zhao

Specifically, we first introduce a word-level prosody encoder, which quantizes the low-frequency band of the speech and compresses prosody attributes in the latent prosody vector (LPV).

Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information

2 code implementations • 28 Nov 2021 • Zhihao Du, Shiliang Zhang, Siqi Zheng, Weilong Huang, Ming Lei

In this paper, we reformulate this task as a single-label prediction problem by encoding the multi-speaker labels with power set.

Action Detection • Activity Detection • +2
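
The power-set encoding mentioned in the summary above can be illustrated with a minimal sketch. It is not taken from the paper: the function names `powerset_encode` and `powerset_decode` and the bit ordering (speaker i maps to bit i) are assumptions made here for illustration. The idea is that each possible combination of simultaneously active speakers becomes one class, so a multi-label target turns into a single label.

```python
from itertools import product


def powerset_encode(activity):
    """Map a per-speaker 0/1 activity vector to one class index.

    Assumption: speaker i contributes bit i (least significant first).
    """
    return sum(bit << i for i, bit in enumerate(activity))


def powerset_decode(label, num_speakers):
    """Recover the per-speaker 0/1 activity vector from a class index."""
    return [(label >> i) & 1 for i in range(num_speakers)]


if __name__ == "__main__":
    # Every subset of 3 speakers gets its own class: 2**3 = 8 classes.
    for activity in product([0, 1], repeat=3):
        print(activity, "->", powerset_encode(activity))
```

With N speakers this yields 2**N classes, so the label space grows quickly as N increases; how the paper handles that trade-off is not stated in this listing.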

FedSpeech: Federated Text-to-Speech with Continual Learning

no code implementations • 14 Oct 2021 • Ziyue Jiang, Yi Ren, Ming Lei, Zhou Zhao

Federated learning enables collaborative training of machine learning models under strict privacy restrictions and federated text-to-speech aims to synthesize natural speech of multiple users with a few audio training samples stored in their devices locally.

Continual Learning • Federated Learning

BeamTransformer: Microphone Array-based Overlapping Speech Detection

no code implementations • 9 Sep 2021 • Siqi Zheng, Shiliang Zhang, Weilong Huang, Qian Chen, Hongbin Suo, Ming Lei, Jinwei Feng, Zhijie Yan

We propose BeamTransformer, an efficient architecture to leverage beamformer's edge in spatial filtering and transformer's capability in context sequence modeling.

EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model

no code implementations • 17 Jun 2021 • Chenye Cui, Yi Ren, Jinglin Liu, Feiyang Chen, Rongjie Huang, Ming Lei, Zhou Zhao

Finally, by showing a comparable performance in the emotional speech synthesis task, we successfully demonstrate the ability of the proposed model.

Emotional Speech Synthesis • Emotion Classification

A PDD Decoder for Binary Linear Codes With Neural Check Polytope Projection

no code implementations • 11 Jun 2020 • Yi Wei, Ming-Min Zhao, Min-Jian Zhao, Ming Lei

Linear Programming (LP) is an important decoding technique for binary linear codes.

Simplified Self-Attention for Transformer-based End-to-End Speech Recognition

no code implementations • 21 May 2020 • Haoneng Luo, Shiliang Zhang, Ming Lei, Lei Xie

Transformer models have been introduced into end-to-end speech recognition with state-of-the-art performance on various tasks owing to their superiority in modeling long-term dependencies.

speech-recognition • Speech Recognition

Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition

1 code implementation • 21 May 2020 • Shiliang Zhang, Zhifu Gao, Haoneng Luo, Ming Lei, Jie Gao, Zhijie Yan, Lei Xie

Recently, streaming end-to-end automatic speech recognition (E2E-ASR) has gained more and more attention.

Sound • Audio and Speech Processing

ADMM-based Decoder for Binary Linear Codes Aided by Deep Learning

no code implementations • 14 Feb 2020 • Yi Wei, Ming-Min Zhao, Min-Jian Zhao, Ming Lei

Inspired by the recent advances in deep learning (DL), this work presents a deep neural network aided decoding algorithm for binary linear codes.

Learned Conjugate Gradient Descent Network for Massive MIMO Detection

1 code implementation • 10 Jun 2019 • Yi Wei, Ming-Min Zhao, Mingyi Hong, Min-Jian Zhao, Ming Lei

Furthermore, in order to reduce the memory costs, a novel quantized LcgNet is proposed, where a low-resolution nonuniform quantizer is integrated into the LcgNet to smartly quantize the aforementioned step-sizes.

Automatic Spelling Correction with Transformer for CTC-based End-to-End Speech Recognition

no code implementations • 27 Mar 2019 • Shiliang Zhang, Ming Lei, Zhijie Yan

Results on a 20,000-hour Mandarin speech recognition task show that the proposed spelling correction model can achieve a CER of 3.41%, a relative improvement of 22.9% and 53.2% over the baseline CTC-based systems decoded with and without a language model, respectively.

Language Modelling • Machine Translation • +4

Deep-FSMN for Large Vocabulary Continuous Speech Recognition

1 code implementation • 4 Mar 2018 • Shiliang Zhang, Ming Lei, Zhijie Yan, Li-Rong Dai

In a 20,000-hour Mandarin recognition task, the LFR-trained DFSMN can achieve more than 20% relative improvement compared to the LFR-trained BLSTM.

Language Modelling • speech-recognition • +1

Deep Feed-forward Sequential Memory Networks for Speech Synthesis

no code implementations • 26 Feb 2018 • Mengxiao Bi, Heng Lu, Shiliang Zhang, Ming Lei, Zhijie Yan

The Bidirectional LSTM (BLSTM) RNN based speech synthesis system is among the best parametric Text-to-Speech (TTS) systems in terms of the naturalness of generated speech, especially the naturalness in prosody.

speech-recognition • Speech Recognition • +1

Data preprocessing methods for robust Fourier ptychographic microscopy

no code implementations • 4 Jun 2017 • Yan Zhang, An Pan, Ming Lei, Baoli Yao

Fourier ptychographic microscopy (FPM) is a recently proposed computational imaging technique with both high resolution and wide field-of-view.
