Search Results for author: Ming Lei

Found 16 papers, 4 papers with code

Self-Critical Alternate Learning based Semantic Broadcast Communication

no code implementations • 3 Dec 2023 • Zhilin Lu, Rongpeng Li, Ming Lei, Chan Wang, Zhifeng Zhao, Honggang Zhang

In particular, to enable stable optimization via a nondifferentiable semantic metric, we regard sentence similarity as a reward and formulate this learning process as an RL problem.

Reinforcement Learning (RL) Semantic Similarity +3

Joint Scattering Environment Sensing and Channel Estimation Based on Non-stationary Markov Random Field

no code implementations • 6 Feb 2023 • Wenkang Xu, Yongbo Xiao, An Liu, Ming Lei, MinJian Zhao

A location domain channel modeling method is proposed based on the position of targets and scatterers in the scattering environment, and the resulting radar and communication channels exhibit a two-dimensional (2-D) joint burst sparsity.

Bayesian Inference Position

ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech

no code implementations • 16 Feb 2022 • Yi Ren, Ming Lei, Zhiying Huang, Shiliang Zhang, Qian Chen, Zhijie Yan, Zhou Zhao

Specifically, we first introduce a word-level prosody encoder, which quantizes the low-frequency band of the speech and compresses prosody attributes in the latent prosody vector (LPV).

Speaker Embedding-aware Neural Diarization for Flexible Number of Speakers with Textual Information

2 code implementations • 28 Nov 2021 • Zhihao Du, Shiliang Zhang, Siqi Zheng, Weilong Huang, Ming Lei

In this paper, we reformulate this task as a single-label prediction problem by encoding the multi-speaker labels with power set.

Action Detection Activity Detection +2
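As a rough illustration of the power-set idea mentioned in the snippet above, the sketch below maps a binary speaker-activity vector (a multi-label target) to a single class index and back. The function names and the bit ordering are assumptions made for this example, not details taken from the paper.

```python
from itertools import product


def powerset_encode(active):
    """Map a binary activity vector (one entry per speaker) to one class index.

    Assumption for this sketch: speaker k contributes bit k of the index.
    """
    label = 0
    for k, is_active in enumerate(active):
        if is_active:
            label |= 1 << k
    return label


def powerset_decode(label, num_speakers):
    """Recover the binary activity vector from a power-set class index."""
    return [(label >> k) & 1 for k in range(num_speakers)]


if __name__ == "__main__":
    # With 3 speakers there are 2**3 = 8 possible classes per frame.
    for combo in product([0, 1], repeat=3):
        idx = powerset_encode(combo)
        assert powerset_decode(idx, 3) == list(combo)
        print(combo, "->", idx)
```

With this encoding a frame where several speakers overlap gets its own class, so a model can predict one label per frame instead of several independent binary labels. The number of classes grows as 2^N with the number of speakers N.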

FedSpeech: Federated Text-to-Speech with Continual Learning

no code implementations • 14 Oct 2021 • Ziyue Jiang, Yi Ren, Ming Lei, Zhou Zhao

Federated learning enables collaborative training of machine learning models under strict privacy restrictions, and federated text-to-speech aims to synthesize natural speech of multiple users with a few audio training samples stored locally on their devices.

Continual Learning Federated Learning

BeamTransformer: Microphone Array-based Overlapping Speech Detection

no code implementations • 9 Sep 2021 • Siqi Zheng, Shiliang Zhang, Weilong Huang, Qian Chen, Hongbin Suo, Ming Lei, Jinwei Feng, Zhijie Yan

We propose BeamTransformer, an efficient architecture to leverage beamformer's edge in spatial filtering and transformer's capability in context sequence modeling.

EMOVIE: A Mandarin Emotion Speech Dataset with a Simple Emotional Text-to-Speech Model

no code implementations • 17 Jun 2021 • Chenye Cui, Yi Ren, Jinglin Liu, Feiyang Chen, Rongjie Huang, Ming Lei, Zhou Zhao

Finally, by showing a comparable performance in the emotional speech synthesis task, we successfully demonstrate the ability of the proposed model.

Emotional Speech Synthesis Emotion Classification

A PDD Decoder for Binary Linear Codes With Neural Check Polytope Projection

no code implementations • 11 Jun 2020 • Yi Wei, Ming-Min Zhao, Min-Jian Zhao, Ming Lei

Linear Programming (LP) is an important decoding technique for binary linear codes.

Simplified Self-Attention for Transformer-based End-to-End Speech Recognition

no code implementations • 21 May 2020 • Haoneng Luo, Shiliang Zhang, Ming Lei, Lei Xie

Transformer models have been introduced into end-to-end speech recognition with state-of-the-art performance on various tasks owing to their superiority in modeling long-term dependencies.

speech-recognition Speech Recognition

Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition

1 code implementation • 21 May 2020 • Shiliang Zhang, Zhifu Gao, Haoneng Luo, Ming Lei, Jie Gao, Zhijie Yan, Lei Xie

Recently, streaming end-to-end automatic speech recognition (E2E-ASR) has gained more and more attention.

Sound Audio and Speech Processing

ADMM-based Decoder for Binary Linear Codes Aided by Deep Learning

no code implementations • 14 Feb 2020 • Yi Wei, Ming-Min Zhao, Min-Jian Zhao, Ming Lei

Inspired by the recent advances in deep learning (DL), this work presents a deep neural network aided decoding algorithm for binary linear codes.

Learned Conjugate Gradient Descent Network for Massive MIMO Detection

1 code implementation • 10 Jun 2019 • Yi Wei, Ming-Min Zhao, Mingyi Hong, Min-Jian Zhao, Ming Lei

Furthermore, in order to reduce the memory costs, a novel quantized LcgNet is proposed, where a low-resolution nonuniform quantizer is integrated into the LcgNet to smartly quantize the aforementioned step-sizes.

Automatic Spelling Correction with Transformer for CTC-based End-to-End Speech Recognition

no code implementations • 27 Mar 2019 • Shiliang Zhang, Ming Lei, Zhijie Yan

Results on a 20,000-hour Mandarin speech recognition task show that the proposed spelling correction model achieves a CER of 3.41%, a 22.9% and 53.2% relative improvement over the baseline CTC-based systems decoded with and without a language model, respectively.

Language Modelling Machine Translation +4
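The relative-improvement figures in the snippet above can be related to error rates with a little arithmetic. The sketch below assumes the common definition, relative improvement = (baseline - new) / baseline; the snippet does not state which definition the authors use, and the function names are hypothetical. Any baseline value it returns is only a back-of-envelope estimate from rounded figures, not a number reported in the paper.

```python
def relative_improvement(baseline_cer, new_cer):
    """Relative error reduction, assuming (baseline - new) / baseline."""
    return (baseline_cer - new_cer) / baseline_cer


def implied_baseline(new_cer, rel_improvement):
    """Invert the definition above to estimate the baseline error rate."""
    return new_cer / (1.0 - rel_improvement)


if __name__ == "__main__":
    # Figures quoted in the snippet: CER 3.41%, relative gains 22.9% and 53.2%.
    for rel in (0.229, 0.532):
        est = implied_baseline(3.41, rel)
        print(f"relative gain {rel:.1%} -> implied baseline CER of about {est:.2f}%")
```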

Deep-FSMN for Large Vocabulary Continuous Speech Recognition

1 code implementation • 4 Mar 2018 • Shiliang Zhang, Ming Lei, Zhijie Yan, Li-Rong Dai

On a 20,000-hour Mandarin recognition task, the LFR-trained DFSMN achieves more than 20% relative improvement over the LFR-trained BLSTM.

Language Modelling speech-recognition +1

Deep Feed-forward Sequential Memory Networks for Speech Synthesis

no code implementations • 26 Feb 2018 • Mengxiao Bi, Heng Lu, Shiliang Zhang, Ming Lei, Zhijie Yan

The Bidirectional LSTM (BLSTM) RNN based speech synthesis system is among the best parametric Text-to-Speech (TTS) systems in terms of the naturalness of generated speech, especially the naturalness in prosody.

speech-recognition Speech Recognition +1

Data preprocessing methods for robust Fourier ptychographic microscopy

no code implementations • 4 Jun 2017 • Yan Zhang, An Pan, Ming Lei, Baoli Yao

Fourier ptychographic microscopy (FPM) is a recently proposed computational imaging technique with both high resolution and wide field-of-view.
