Search Results for author: Siyang Wang

Found 10 papers, 1 paper with code

Evaluating Sampling-based Filler Insertion with Spontaneous TTS

no code implementations • LREC 2022 • Siyang Wang, Joakim Gustafson, Éva Székely

Perceptual results show little difference between the compared filler insertion models, including ground truth, which may be due to the ambiguity of what constitutes good filler insertion and to a strong neural spontaneous TTS that produces natural speech irrespective of input.

On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis

no code implementations • 11 Jul 2023 • Siyang Wang, Gustav Eje Henter, Joakim Gustafson, Éva Székely

Prior work has shown that SSL is an effective intermediate representation in two-stage text-to-speech (TTS) for both read and spontaneous speech.

Self-Supervised Learning • Speech Synthesis

Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis

no code implementations • 15 Jun 2023 • Shivam Mehta, Siyang Wang, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter

With read-aloud speech synthesis achieving high naturalness scores, there is a growing research interest in synthesising spontaneous speech.

Denoising • Speech Synthesis

Automatic Evaluation of Turn-taking Cues in Conversational Speech Synthesis

no code implementations • 29 May 2023 • Erik Ekstedt, Siyang Wang, Éva Székely, Joakim Gustafson, Gabriel Skantze

Turn-taking is a fundamental aspect of human communication where speakers convey their intention to either hold, or yield, their turn through prosodic cues.

Speech Synthesis

A Comparative Study of Self-Supervised Speech Representations in Read and Spontaneous TTS

no code implementations • 5 Mar 2023 • Siyang Wang, Gustav Eje Henter, Joakim Gustafson, Éva Székely

Recent work has explored using self-supervised learning (SSL) speech representations such as wav2vec 2.0 as the representation medium in standard two-stage TTS, in place of conventionally used mel-spectrograms.

Self-Supervised Learning

Integrated Speech and Gesture Synthesis

1 code implementation • 25 Aug 2021 • Siyang Wang, Simon Alexanderson, Joakim Gustafson, Jonas Beskow, Gustav Eje Henter, Éva Székely

Text-to-speech and co-speech gesture synthesis have until now been treated as separate areas by two different research communities, and applications merely stack the two technologies using a simple system-level pipeline.

Speech Synthesis

Theory of the Chromatic Dispersion, Revisited

no code implementations • 30 Oct 2020 • Dimitar Popmintchev, Siyang Wang, Xiaoshi Zhang, Tenio Popmintchev

We derive general analytic expressions for the chromatic dispersion orders valid to infinity, due to the k vector or phase φ dependence on the wavelength.

Optics • Applied Physics • Atomic and Molecular Clusters

Unaligned Image-to-Sequence Transformation with Loop Consistency

no code implementations • ICLR 2020 • Siyang Wang, Justin Lazarow, Kwonjoon Lee, Zhuowen Tu

We tackle the problem of modeling sequential visual phenomena.

Attention is all you need for Videos: Self-attention based Video Summarization using Universal Transformers

no code implementations • 6 Jun 2019 • Manjot Bilkhu, Siyang Wang, Tushar Dobhal

Video Captioning and Summarization have become very popular in recent years due to advancements in Sequence Modelling, with the resurgence of Long Short-Term Memory networks (LSTMs) and the introduction of Gated Recurrent Units (GRUs).

Dense Video Captioning • Dimensionality Reduction • +1

Controllable Top-down Feature Transformer

no code implementations • 6 Dec 2017 • Zhiwei Jia, Haoshen Hong, Siyang Wang, Kwonjoon Lee, Zhuowen Tu

We study the intrinsic transformation of feature maps across convolutional network layers with explicit top-down control.

Data Augmentation • Style Transfer
