no code implementations • 2 Feb 2024 • Dan Lyth, Simon King
We propose a scalable method for labeling various aspects of speaker identity, style, and recording conditions.
no code implementations • 2 Jun 2023 • Alistair Carson, Cassia Valentini-Botinhao, Simon King, Stefan Bilbao
The frame duration is an important hyper-parameter of the proposed model, so an investigation was carried out into its effect on model accuracy.
no code implementations • 17 May 2023 • Atli Thor Sigurgeirsson, Simon King
Given only a natural language query text (the prompt), such models can be used to solve specific, context-dependent tasks.
no code implementations • 3 Apr 2023 • Tian Huey Teh, Vivian Hu, Devang S Ram Mohan, Zack Hodari, Christopher G. R. Wallis, Tomás Gomez Ibarrondo, Alexandra Torresquintero, James Leoni, Mark Gales, Simon King
Generating expressive speech with rich and varied prosody continues to be a challenge for Text-to-Speech.
no code implementations • 7 Mar 2023 • Atli Thor Sigurgeirsson, Simon King
This is done by using a learned embedding of the reference utterance, which is used to condition speech generation.
no code implementations • 15 Jun 2021 • Alexandra Torresquintero, Tian Huey Teh, Christopher G. R. Wallis, Marlene Staib, Devang S Ram Mohan, Vivian Hu, Lorenzo Foglianti, Jiameng Gao, Simon King
Text-to-speech is now able to achieve near-human naturalness, and research focus has shifted to increasing expressivity.
no code implementations • 15 Jun 2021 • Devang S Ram Mohan, Vivian Hu, Tian Huey Teh, Alexandra Torresquintero, Christopher G. R. Wallis, Marlene Staib, Lorenzo Foglianti, Jiameng Gao, Simon King
Text does not fully specify the spoken form, so text-to-speech models must be able to learn from speech data that vary in ways not explained by the corresponding text.
no code implementations • 7 Dec 2020 • Pilar Oplustil-Gallegos, Simon King
Many speech synthesis datasets, especially those derived from audiobooks, naturally comprise sequences of utterances.
1 code implementation • 14 Mar 2020 • Zack Hodari, Catherine Lai, Simon King
In English, prosody adds a broad range of information to segment sequences, from information structure (e.g. contrast) to stylistic variation (e.g. expression of emotion).
2 code implementations • 28 Feb 2020 • Jennifer Williams, Joanna Rownicka, Pilar Oplustil, Simon King
Our NN predicts MOS with a high correlation to human judgments.
1 code implementation • 10 Jun 2019 • Zack Hodari, Oliver Watts, Simon King
A generative model that can synthesise multiple prosodies will, by design, not model average prosody.
1 code implementation • 31 Oct 2018 • Cheng-I Lai, Alberto Abad, Korin Richmond, Junichi Yamagishi, Najim Dehak, Simon King
In this work, we propose our replay attack detection system, the Attentive Filtering Network, which combines an attention-based filtering mechanism that enhances feature representations in both the frequency and time domains with a ResNet-based classifier.
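The core idea of the attention-based filtering step can be sketched in a few lines: a learned attention map (here an arbitrary array standing in for the attention network's output) gates the time-frequency feature map element-wise before it is passed to the classifier. This is a simplified illustration of the general mechanism, not the authors' architecture; the function and variable names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attentive_filter(features, attention_logits):
    """Gate a time-frequency feature map element-wise with an attention mask.
    In the real system the logits come from a small attention network; here
    they are an arbitrary array of the same shape (illustrative only)."""
    mask = sigmoid(attention_logits)  # values in (0, 1)
    return features * mask            # emphasise informative regions

spec = np.ones((4, 5))      # toy log-spectrogram: 4 frequency bins x 5 frames
logits = np.zeros((4, 5))   # sigmoid(0) = 0.5, so every cell is halved
filtered = attentive_filter(spec, logits)
```

The gated output keeps the input's shape, so it can feed any downstream classifier unchanged.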
no code implementations • 22 Aug 2016 • Srikanth Ronanki, Oliver Watts, Simon King, Gustav Eje Henter
This paper proposes a new approach to duration modelling for statistical parametric speech synthesis in which a recurrent statistical model is trained to output a phone transition probability at each timestep (acoustic frame).
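The duration model described above can be illustrated with a minimal sampling loop: given per-frame transition probabilities, the phone's duration is the first frame at which a transition fires. This is a sketch of the sampling step only, with hypothetical names; in the paper a recurrent model predicts the probability at each acoustic frame.

```python
import random

def sample_phone_duration(transition_probs, rng):
    """Sample a phone duration (in acoustic frames) from per-frame phone
    transition probabilities: at each frame a coin flip decides whether the
    phone ends, and the duration is the first frame where it does.
    Illustrative sketch, not the authors' implementation."""
    for t, p in enumerate(transition_probs):
        if rng.random() < p:
            return t + 1
    return len(transition_probs)  # force a transition at the final frame

# Fixed probabilities stand in for a recurrent model's frame-wise outputs.
probs = [0.0, 0.0, 0.1, 0.5, 0.9, 1.0]
d = sample_phone_duration(probs, random.Random(0))
```

Because the first two probabilities are zero and the last is one, any sampled duration falls between 3 and 6 frames.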
no code implementations • 18 Aug 2016 • Srikanth Ronanki, Siva Reddy, Bajibabu Bollepalli, Simon King
These methods first convert the ASCII text to a phonetic script, and then train a deep neural network to synthesise speech from it.
no code implementations • 22 Feb 2016 • Zhizheng Wu, Simon King
We propose two novel techniques, stacking bottleneck features and a minimum generation error training criterion, to improve the performance of deep neural network (DNN)-based speech synthesis.
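The feature-stacking part of this idea can be sketched briefly: each frame's bottleneck vector is concatenated with its neighbouring frames' vectors to give the DNN temporal context. The function below is a hypothetical illustration of that stacking step (edge frames padded by repetition), not the authors' code; the context width is an assumed parameter.

```python
import numpy as np

def stack_bottleneck(bottleneck, context=2):
    """Concatenate each frame's bottleneck vector with its +/- `context`
    neighbours (edges padded by repeating the boundary frame), producing
    the augmented per-frame input for a second DNN. Illustrative sketch."""
    T, D = bottleneck.shape
    padded = np.pad(bottleneck, ((context, context), (0, 0)), mode="edge")
    # One shifted view per context offset, concatenated along features.
    return np.hstack([padded[i:i + T] for i in range(2 * context + 1)])

bn = np.arange(12, dtype=float).reshape(6, 2)  # 6 frames, 2-dim bottleneck
stacked = stack_bottleneck(bn, context=2)      # 6 frames, (2*2+1)*2 = 10 dims
```

With context 2, the centre slice of the stacked vector (columns 4:6 here) is the original frame's own bottleneck vector.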
no code implementations • 11 Jan 2016 • Zhizheng Wu, Simon King
Recently, recurrent neural networks (RNNs) as powerful sequence models have re-emerged as a potential acoustic model for statistical parametric speech synthesis (SPSS).