Search Results for author: Yannis Stylianou

Found 16 papers, 5 papers with code

Vocal effort modeling in neural TTS for improving the intelligibility of synthetic speech in noise

no code implementations20 Mar 2022 Tuomo Raitio, Petko Petkov, Jiangchuan Li, Muhammed Shifas, Andrea Davis, Yannis Stylianou

We present a neural text-to-speech (TTS) method that models natural vocal effort variation to improve the intelligibility of synthetic speech in the presence of noise.

Combining speakers of multiple languages to improve quality of neural voices

no code implementations17 Aug 2021 Javier Latorre, Charlotte Bailleul, Tuuli Morrill, Alistair Conkie, Yannis Stylianou

In this work, we explore multiple architectures and training procedures for developing a multi-speaker and multi-lingual neural TTS system with the goals of a) improving the quality when the available data in the target language is limited and b) enabling cross-lingual synthesis.

Enhancing Speech Intelligibility in Text-To-Speech Synthesis using Speaking Style Conversion

1 code implementation13 Aug 2020 Dipjyoti Paul, Muhammed PV Shifas, Yannis Pantazis, Yannis Stylianou

Intelligibility enhancement, as quantified by the Speech Intelligibility in Bits (SIIB-Gauss) measure, shows that the proposed Lombard-SSDRC TTS system achieves significant relative improvement over the state-of-the-art TTS approach: between 110% and 130% in speech-shaped noise (SSN), and between 47% and 140% in competing-speaker noise (CSN).

Speech Synthesis Text-To-Speech Synthesis +1

Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions

1 code implementation9 Aug 2020 Dipjyoti Paul, Yannis Pantazis, Yannis Stylianou

In terms of performance, our system has been preferred over the baseline TTS system by 60% over 15.5% and by 60.9% over 32.6%, for seen and unseen speakers, respectively.

Speech Synthesis

Audiovisual Speech Synthesis using Tacotron2

no code implementations3 Aug 2020 Ahmed Hussen Abdelaziz, Anushree Prasanna Kumar, Chloe Seivwright, Gabriele Fanelli, Justin Binder, Yannis Stylianou, Sachin Kajarekar

The output acoustic features are used to condition a WaveRNN to reconstruct the speech waveform, and the output facial controllers are used to generate the corresponding video of the talking face.

Face Model Sentence +1

Cumulant GAN

no code implementations11 Jun 2020 Yannis Pantazis, Dipjyoti Paul, Michail Fasoulakis, Yannis Stylianou, Markos Katsoulakis

In this paper, we propose a novel loss function for training Generative Adversarial Networks (GANs) aiming towards deeper theoretical understanding as well as improved stability and performance for the underlying optimization problem.

Image Generation

A non-causal FFTNet architecture for speech enhancement

1 code implementation8 Jun 2020 Muhammed PV Shifas, Nagaraj Adiga, Vassilis Tsiaras, Yannis Stylianou

By using a shallow network and applying non-causality within certain limits, the proposed FFTNet for speech enhancement (SE-FFTNet) uses far fewer parameters than other neural-network-based speech enhancement approaches such as WaveNet and SEGAN.

Speech Enhancement

Maximum Voiced Frequency Estimation: Exploiting Amplitude and Phase Spectra

no code implementations31 May 2020 Thomas Drugman, Yannis Stylianou

Recent studies have shown that its proper estimation and modeling enhance the quality of statistical parametric speech synthesizers.

Training Generative Adversarial Networks with Weights

no code implementations6 Nov 2018 Yannis Pantazis, Dipjyoti Paul, Michail Fasoulakis, Yannis Stylianou

The impressive success of Generative Adversarial Networks (GANs) is often overshadowed by the difficulties in their training.

Speech intelligibility enhancement based on a non-causal Wavenet-like model

1 code implementation Interspeech 2018 Muhammed Shifas PV, Vassilis Tsiaras, Yannis Stylianou

Low speech intelligibility in noisy listening conditions makes our communication with others more difficult.

Automatic acoustic detection of birds through deep learning: the first Bird Audio Detection challenge

no code implementations16 Jul 2018 Dan Stowell, Yannis Stylianou, Mike Wood, Hanna Pamuła, Hervé Glotin

Assessing the presence and abundance of birds is important for monitoring specific species as well as overall ecosystem health.

Sound Audio and Speech Processing

Spoken Dialogue for Information Navigation

no code implementations WS 2018 Alexandros Papangelis, Panagiotis Papadakos, Yannis Stylianou, Yannis Tzitzikas

Aiming to expand the current research paradigm for training conversational AI agents that can address real-world challenges, we take a step away from traditional slot-filling goal-oriented spoken dialogue systems (SDS) and model the dialogue in a way that allows users to be more expressive in describing their needs.

Information Retrieval slot-filling +2

LD-SDS: Towards an Expressive Spoken Dialogue System based on Linked-Data

no code implementations9 Oct 2017 Alexandros Papangelis, Panagiotis Papadakos, Margarita Kotti, Yannis Stylianou, Yannis Tzitzikas, Dimitris Plexousakis

In this work we discuss the related challenges and describe an approach towards the fusion of state-of-the-art technologies from the Spoken Dialogue Systems (SDS) and the Semantic Web and Information Retrieval domains.

Conversational Search Information Retrieval +4

Bird detection in audio: a survey and a challenge

no code implementations11 Aug 2016 Dan Stowell, Mike Wood, Yannis Stylianou, Hervé Glotin

Many biological monitoring projects rely on acoustic detection of birds.

Sound
