Search Results for author: Guanglai Gao

Found 23 papers, 10 papers with code

L$^2$GC: Lorentzian Linear Graph Convolutional Networks For Node Classification

1 code implementation • 10 Mar 2024 • Qiuyu Liang, Weihua Wang, Feilong Bao, Guanglai Gao

Specifically, we map the learned features of graph nodes into hyperbolic space, and then perform a Lorentzian linear feature transformation to capture the underlying tree-like structure of data.

Node Classification
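The mapping step described in the snippet above can be sketched in numpy: Euclidean node features are lifted onto the Lorentz model of hyperbolic space via the exponential map at the origin (curvature fixed at -1 here; the function names are illustrative and not taken from the paper's code).

```python
import numpy as np

def lorentz_exp_map0(v):
    """Map Euclidean feature vectors v of shape (N, d) onto the Lorentz
    model of hyperbolic space (curvature -1) via the exponential map at
    the origin o = (1, 0, ..., 0), giving points of shape (N, d+1)."""
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    norm = np.clip(norm, 1e-9, None)       # avoid division by zero
    x0 = np.cosh(norm)                     # time-like coordinate
    xs = np.sinh(norm) * v / norm          # space-like coordinates
    return np.concatenate([x0, xs], axis=-1)

def lorentz_inner(x, y):
    """Lorentzian inner product <x, y>_L = -x0*y0 + <xs, ys>."""
    return -x[..., 0] * y[..., 0] + np.sum(x[..., 1:] * y[..., 1:], axis=-1)

feats = np.random.randn(4, 3)              # toy node features
pts = lorentz_exp_map0(feats)
# Every mapped point lies on the hyperboloid: <x, x>_L == -1
print(np.allclose(lorentz_inner(pts, pts), -1.0))  # prints True
```

Points satisfying the hyperboloid constraint can then be fed to a Lorentzian linear transformation, which is where the tree-like structure is exploited.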

Learning Noise-Robust Joint Representation for Multimodal Emotion Recognition under Realistic Incomplete Data Scenarios

1 code implementation • 21 Sep 2023 • Qi Fan, Haolin Zuo, Rui Liu, Zheng Lian, Guanglai Gao

Multimodal emotion recognition (MER) in practical scenarios presents a significant challenge due to the presence of incomplete data, such as missing or noisy data.

Multimodal Emotion Recognition

TransERR: Translation-based Knowledge Graph Embedding via Efficient Relation Rotation

2 code implementations • 26 Jun 2023 • Jiang Li, Xiangdong Su, Fujun Zhang, Guanglai Gao

This paper presents a translation-based knowledge graph embedding method via efficient relation rotation (TransERR), a straightforward yet effective alternative to traditional translation-based knowledge graph embedding models.

Knowledge Graph Embedding Mathematical Proofs +2
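Rotation-based embedding models of this kind rely on quaternion rotation; a minimal sketch of the underlying operation (the Hamilton product, whose norm-multiplicative property is what lets unit quaternions act as rotations) is below. This illustrates the general mechanism only, not TransERR's exact scoring function.

```python
import numpy as np

def hamilton(q, p):
    """Hamilton product of quaternions q = (a, b, c, d) and p = (e, f, g, h).
    Rotating an embedding by a *unit* quaternion preserves its norm."""
    a, b, c, d = q
    e, f, g, h = p
    return np.array([
        a * e - b * f - c * g - d * h,
        a * f + b * e + c * h - d * g,
        a * g - b * h + c * e + d * f,
        a * h + b * g - c * f + d * e,
    ])

r = np.array([1.0, 2.0, 0.0, -1.0])
r /= np.linalg.norm(r)                    # unit "relation" quaternion
h_emb = np.array([0.5, -0.3, 0.8, 0.1])   # toy entity embedding
rotated = hamilton(r, h_emb)
# Norm is preserved under rotation by a unit quaternion
print(np.isclose(np.linalg.norm(rotated), np.linalg.norm(h_emb)))  # prints True
```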

Betray Oneself: A Novel Audio DeepFake Detection Model via Mono-to-Stereo Conversion

1 code implementation • 25 May 2023 • Rui Liu, Jinhua Zhang, Guanglai Gao, Haizhou Li

In this paper, we propose a novel ADD model, termed M2S-ADD, that attempts to discover audio authenticity cues during the mono-to-stereo conversion process.

DeepFake Detection Face Swapping +1

MnTTS2: An Open-Source Multi-Speaker Mongolian Text-to-Speech Synthesis Dataset

1 code implementation • 11 Dec 2022 • Kailin Liang, Bin Liu, Yifan Hu, Rui Liu, Feilong Bao, Guanglai Gao

Text-to-Speech (TTS) synthesis for low-resource languages is an attractive research issue in academia and industry nowadays.

Speech Synthesis Text-To-Speech Synthesis

Explicit Intensity Control for Accented Text-to-speech

no code implementations • 27 Oct 2022 • Rui Liu, Haolin Zuo, De Hu, Guanglai Gao, Haizhou Li

Accented text-to-speech (TTS) synthesis seeks to generate speech with an accent (L2) as a variant of the standard version (L1).

Speech Recognition

FCTalker: Fine and Coarse Grained Context Modeling for Expressive Conversational Speech Synthesis

1 code implementation • 27 Oct 2022 • Yifan Hu, Rui Liu, Guanglai Gao, Haizhou Li

Therefore, we propose a novel expressive conversational TTS model, termed FCTalker, that learns fine- and coarse-grained context dependency simultaneously during speech generation.

Speech Synthesis

A Deep Investigation of RNN and Self-attention for the Cyrillic-Traditional Mongolian Bidirectional Conversion

no code implementations • 24 Sep 2022 • Muhan Na, Rui Liu, Feilong Bao, Guanglai Gao

To answer this question, this paper investigates the utility of these two powerful techniques for the CTMBC task, combined with the agglutinative characteristics of the Mongolian language.

Machine Translation

MnTTS: An Open-Source Mongolian Text-to-Speech Synthesis Dataset and Accompanied Baseline

1 code implementation • 22 Sep 2022 • Yifan Hu, Pengkai Yin, Rui Liu, Feilong Bao, Guanglai Gao

This paper introduces a high-quality open-source text-to-speech (TTS) synthesis dataset for Mongolian, a low-resource language spoken by over 10 million people worldwide.

Speech Synthesis Text-To-Speech Synthesis

Controllable Accented Text-to-Speech Synthesis

no code implementations • 22 Sep 2022 • Rui Liu, Berrak Sisman, Guanglai Gao, Haizhou Li

Accented TTS synthesis is challenging, as L2 differs from L1 in terms of both phonetic rendering and prosody pattern.

Speech Synthesis Text-To-Speech Synthesis

Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning

1 code implementation • 15 Jun 2022 • Rui Liu, Berrak Sisman, Björn Schuller, Guanglai Gao, Haizhou Li

In this paper, we propose a data-driven deep learning model, i.e., StrengthNet, to improve the generalization of emotion strength assessment for seen and unseen speech.

Attribute Emotion Classification +2

Guided Training: A Simple Method for Single-channel Speaker Separation

no code implementations • 26 Mar 2021 • Hao Li, Xueliang Zhang, Guanglai Gao

Another way is to use an anchor speech, a short speech of the target speaker, to model the speaker identity.

Speaker Separation Speech Separation

Modeling Prosodic Phrasing with Multi-Task Learning in Tacotron-based TTS

no code implementations • 11 Aug 2020 • Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li

We propose a multi-task learning scheme for Tacotron training, that optimizes the system to predict both Mel spectrum and phrase breaks.

Multi-Task Learning Speech Synthesis
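The joint objective can be sketched as a weighted sum of a Mel-spectrum regression loss and a phrase-break classification loss. The choice of L1 regression, softmax cross-entropy, and the weighting scheme here are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

def multitask_loss(mel_pred, mel_true, break_logits, break_labels, lam=0.5):
    """Multi-task objective: Mel-spectrum L1 regression plus softmax
    cross-entropy over break/no-break labels, combined with weight lam.
    (Illustrative losses and weighting, not the paper's exact setup.)"""
    mel_loss = np.mean(np.abs(mel_pred - mel_true))
    # numerically stable log-softmax cross-entropy
    z = break_logits - break_logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    ce = -np.mean(log_probs[np.arange(len(break_labels)), break_labels])
    return mel_loss + lam * ce

rng = np.random.default_rng(0)
mel = rng.normal(size=(10, 80))           # 10 frames x 80 Mel bins
logits = rng.normal(size=(10, 2))         # break / no-break logits
labels = rng.integers(0, 2, size=10)
loss = multitask_loss(mel + 0.1, mel, logits, labels)
print(loss > 0)                           # prints True
```

Sharing the Tacotron encoder between the two heads is what lets the phrase-break supervision shape the learned prosody.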

An Edge Information and Mask Shrinking Based Image Inpainting Approach

no code implementations • 11 Jun 2020 • Huali Xu, Xiangdong Su, Meng Wang, Xiang Hao, Guanglai Gao

The mask shrinking strategy is employed in the image completion model to track the areas to be repaired.

Image Inpainting
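The mask-shrinking idea can be illustrated as repeated erosion of the hole region: pixels on the hole's border (those with at least one known neighbour) are repaired first, and the mask tracking the remaining area shrinks each step. This is a simplified stand-in for the paper's strategy, not its actual implementation.

```python
import numpy as np

def shrink_mask(mask):
    """One shrinking step over a binary mask (1 = hole, 0 = known):
    a hole pixel becomes known once it has at least one known
    4-neighbour, so the region still to repair shrinks inward."""
    padded = np.pad(mask, 1, constant_values=1)   # outside counts as hole
    known_neighbour = (
        (padded[:-2, 1:-1] == 0) | (padded[2:, 1:-1] == 0) |
        (padded[1:-1, :-2] == 0) | (padded[1:-1, 2:] == 0)
    )
    new_mask = mask.copy()
    new_mask[(mask == 1) & known_neighbour] = 0
    return new_mask

mask = np.zeros((5, 5), dtype=int)
mask[1:4, 1:4] = 1                        # 3x3 hole in a 5x5 image
step1 = shrink_mask(mask)
print(int(step1.sum()))                   # prints 1 (only the centre remains)
```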

Sub-Band Knowledge Distillation Framework for Speech Enhancement

no code implementations • 29 May 2020 • Xiang Hao, Shixue Wen, Xiangdong Su, Yun Liu, Guanglai Gao, Xiaofei Li

In single-channel speech enhancement, methods based on full-band spectral features have been widely studied.

Knowledge Distillation Speech Enhancement
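A sketch of the sub-band setup: the full-band spectrogram is split into equal-width frequency bands, and each sub-band student can then be matched against the corresponding slice of a full-band teacher's output. The equal-width band layout and the L1 matching loss are assumptions for illustration, not the paper's exact design.

```python
import numpy as np

def split_subbands(spec, n_bands=4):
    """Split a (frames, freq_bins) magnitude spectrogram into n_bands
    equal-width frequency sub-bands (equal widths are an assumption)."""
    frames, n_bins = spec.shape
    assert n_bins % n_bands == 0, "freq bins must divide evenly"
    w = n_bins // n_bands
    return [spec[:, i * w:(i + 1) * w] for i in range(n_bands)]

def distill_loss(student_bands, teacher_full):
    """L1 distillation loss between each sub-band student output and the
    matching frequency slice of the full-band teacher output."""
    teacher_bands = split_subbands(teacher_full, len(student_bands))
    return float(np.mean([np.mean(np.abs(s - t))
                          for s, t in zip(student_bands, teacher_bands)]))

spec = np.abs(np.random.randn(50, 128))   # toy magnitude spectrogram
bands = split_subbands(spec, n_bands=4)
print(len(bands), bands[0].shape)         # prints: 4 (50, 32)
```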

WaveTTS: Tacotron-based TTS with Joint Time-Frequency Domain Loss

no code implementations • 2 Feb 2020 • Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li

To address this problem, we propose a new training scheme for Tacotron-based TTS, referred to as WaveTTS, that has two loss functions: 1) a time-domain loss, denoted as the waveform loss, that measures the distortion between the natural and generated waveforms; and 2) a frequency-domain loss, that measures the Mel-scale acoustic feature loss between the natural and generated acoustic features.
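The two-part objective can be sketched as follows; the frequency-domain term here uses a plain magnitude STFT as a stand-in for the Mel-scale feature loss, and the Hann window and equal weighting are assumptions for illustration.

```python
import numpy as np

def stft_mag(wav, n_fft=256, hop=64):
    """Magnitude spectrogram via a framed FFT with a Hann window."""
    win = np.hanning(n_fft)
    frames = np.array([wav[i:i + n_fft] * win
                       for i in range(0, len(wav) - n_fft + 1, hop)])
    return np.abs(np.fft.rfft(frames, axis=-1))

def wavetts_loss(wav_pred, wav_true, alpha=1.0):
    """Joint objective: time-domain waveform L1 plus frequency-domain
    magnitude L1, combined with weight alpha (weighting is illustrative)."""
    time_loss = np.mean(np.abs(wav_pred - wav_true))
    freq_loss = np.mean(np.abs(stft_mag(wav_pred) - stft_mag(wav_true)))
    return time_loss + alpha * freq_loss

t = np.linspace(0, 1, 1600)
clean = np.sin(2 * np.pi * 220 * t)           # toy "natural" waveform
noisy = clean + 0.05 * np.random.randn(len(t))
print(wavetts_loss(clean, clean) == 0.0, wavetts_loss(noisy, clean) > 0.0)
# prints: True True
```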

Teacher-Student Training for Robust Tacotron-based TTS

no code implementations • 7 Nov 2019 • Rui Liu, Berrak Sisman, Jingdong Li, Feilong Bao, Guanglai Gao, Haizhou Li

We first train a Tacotron2-based TTS model, which serves as a teacher model, by always providing natural speech frames to the decoder.

Knowledge Distillation

A LSTM Approach with Sub-Word Embeddings for Mongolian Phrase Break Prediction

no code implementations • COLING 2018 • Rui Liu, Feilong Bao, Guanglai Gao, Hui Zhang, Yonghe Wang

In this paper, we first apply word embeddings that focus on sub-word units to the Mongolian Phrase Break (PB) prediction task, using a Long Short-Term Memory (LSTM) model.

Dictionary Learning Machine Translation +2

Mongolian Named Entity Recognition System with Rich Features

no code implementations • COLING 2016 • Weihua Wang, Feilong Bao, Guanglai Gao

The system based on segmenting suffixes with all proposed features yields a benchmark result of F-measure = 84.65 on this corpus.

Machine Translation named-entity-recognition +3
