Expressive Speech Synthesis
11 papers with code • 0 benchmarks • 0 datasets
Latest papers with no code
Multi-Speaker Expressive Speech Synthesis via Semi-supervised Contrastive Learning
This paper aims to build a multi-speaker expressive TTS system that synthesizes a target speaker's speech with multiple styles and emotions.
Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis
The spontaneous behavior that often occurs in conversation makes speech more human-like than reading-style speech.
Cross-lingual Prosody Transfer for Expressive Machine Dubbing
Prosody transfer is well-studied in the context of expressive speech synthesis.
Ensemble prosody prediction for expressive speech synthesis
Generating expressive speech with rich and varied prosody continues to be a challenge for Text-to-Speech.
On granularity of prosodic representations in expressive text-to-speech
In expressive speech synthesis, it is common practice to use latent prosody representations to handle the variability of the data during training.
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling
This paper aims to synthesize the target speaker's speech with desired speaking style and emotion by transferring the style and emotion from reference speech recorded by other speakers.
Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis
A large part of the expressive speech synthesis literature focuses on learning prosodic representations of the speech signal which are then modeled by a prior distribution during inference.
Learning utterance-level representations through token-level acoustic latents prediction for Expressive Speech Synthesis
We show that the fine-grained latent space also captures coarse-grained information, which becomes more evident as the dimension of the latent space increases to capture diverse prosodic representations.
Self-supervised Context-aware Style Representation for Expressive Speech Synthesis
In this paper, we propose a novel framework for learning style representation from abundant plain text in a self-supervised manner.
Fine-grained Noise Control for Multispeaker Speech Synthesis
A text-to-speech (TTS) model typically factorizes speech attributes such as content, speaker, and prosody into disentangled representations. Recent works aim to additionally model the acoustic conditions explicitly, in order to disentangle the primary speech factors, i.e., linguistic content, prosody, and timbre, from any residual factors such as recording conditions and background noise. This paper proposes unsupervised, interpretable, and fine-grained noise and prosody modeling.
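The factorization described above can be sketched roughly as follows. This is a minimal illustration, not any specific paper's method: the three encoders are stubbed with random projections, and all names and dimensions are hypothetical; a real model would learn these encoders and decode the combined features to a mel-spectrogram.

```python
# Hypothetical sketch of a TTS acoustic model conditioned on disentangled
# factors (linguistic content, speaker timbre, prosody). Encoders are
# stand-ins: real systems would learn them from data.
import numpy as np

rng = np.random.default_rng(0)
D = 16  # embedding dimension (arbitrary choice for illustration)

def content_encoder(phonemes):
    # Linguistic content -> one feature vector per phoneme.
    return rng.standard_normal((len(phonemes), D))

def speaker_embedding(speaker_id):
    # Global timbre vector for the given speaker.
    return rng.standard_normal(D)

def prosody_encoder(reference_audio):
    # Utterance-level prosody/style vector from a reference.
    return rng.standard_normal(D)

def decode(content, speaker, prosody):
    # Broadcast the global factors over the phoneme axis and combine.
    # Because the factors are separate inputs, each can be swapped
    # independently (e.g. keep the speaker, change the prosody).
    return content + speaker[None, :] + prosody[None, :]

phonemes = ["HH", "AH", "L", "OW"]
frames = decode(content_encoder(phonemes),
                speaker_embedding(3),
                prosody_encoder("reference.wav"))
print(frames.shape)  # one conditioned feature row per input phoneme
```

Swapping only the `prosody_encoder` input while keeping the speaker embedding fixed is the disentanglement property these works aim for: style transfer without changing timbre.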