Search Results for author: Florian Lux

Found 12 papers, 7 papers with code

Controllable Generation of Artificial Speaker Embeddings through Discovery of Principal Directions

no code implementations • 26 Oct 2023 • Florian Lux, Pascal Tilli, Sarina Meyer, Ngoc Thang Vu

Customizing voice and speaking style in a speech synthesis system with intuitive and fine-grained controls is challenging, given that little data with appropriate labels is available.

Speech Synthesis


The IMS Toucan System for the Blizzard Challenge 2023

1 code implementation • 26 Oct 2023 • Florian Lux, Julia Koch, Sarina Meyer, Thomas Bott, Nadja Schauffler, Pavel Denisov, Antje Schweitzer, Ngoc Thang Vu

For our contribution to the Blizzard Challenge 2023, we improved on the system we submitted to the Blizzard Challenge 2021.


Low-Resource Multilingual and Zero-Shot Multispeaker TTS

1 code implementation • 21 Oct 2022 • Florian Lux, Julia Koch, Ngoc Thang Vu

While neural methods for text-to-speech (TTS) have shown great advances in modeling multiple speakers, even in zero-shot settings, the amount of data needed for those approaches is generally not feasible for the vast majority of the world's over 6,000 spoken languages.

Meta-Learning, Voice Cloning


Combining Contrastive and Non-Contrastive Losses for Fine-Tuning Pretrained Models in Speech Analysis

no code implementations • 21 Oct 2022 • Florian Lux, Ching-Yi Chen, Ngoc Thang Vu

This pretrained model is then fine-tuned for a specific task.

Emotion Classification


Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy

1 code implementation • 13 Oct 2022 • Sarina Meyer, Pascal Tilli, Pavel Denisov, Florian Lux, Julia Koch, Ngoc Thang Vu

To protect the privacy of speech data, speaker anonymization aims to hide the identity of a speaker by changing the voice in speech recordings.

Generative Adversarial Network


Speaker Anonymization with Phonetic Intermediate Representations

1 code implementation • 11 Jul 2022 • Sarina Meyer, Florian Lux, Pavel Denisov, Julia Koch, Pascal Tilli, Ngoc Thang Vu

In this work, we propose a speaker anonymization pipeline that leverages high-quality automatic speech recognition and synthesis systems to generate speech conditioned on phonetic transcriptions and anonymized speaker embeddings.

Automatic Speech Recognition (ASR)


PoeticTTS -- Controllable Poetry Reading for Literary Studies

no code implementations • 11 Jul 2022 • Julia Koch, Florian Lux, Nadja Schauffler, Toni Bernhart, Felix Dieterle, Jonas Kuhn, Sandra Richter, Gabriel Viehhauser, Ngoc Thang Vu

Speech synthesis for poetry is challenging due to specific intonation patterns inherent to poetic speech.

Speech Synthesis


Exact Prosody Cloning in Zero-Shot Multispeaker Text-to-Speech

2 code implementations • 24 Jun 2022 • Florian Lux, Julia Koch, Ngoc Thang Vu

The cloning of a speaker's voice using an untranscribed reference sample is one of the great advances of modern neural text-to-speech (TTS) methods.


Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features

1 code implementation • ACL 2022 • Florian Lux, Ngoc Thang Vu

While neural text-to-speech systems perform remarkably well in high-resource scenarios, they cannot be applied to the majority of the over 6,000 spoken languages in the world due to a lack of appropriate training data.

Meta-Learning


Meta-Learning for improving rare word recognition in end-to-end ASR

no code implementations • 25 Feb 2021 • Florian Lux, Ngoc Thang Vu

We propose a new method of generating meaningful embeddings for speech, changes to four commonly used meta-learning approaches that enable them to perform keyword spotting in continuous signals, and an approach for combining their outcomes into an end-to-end automatic speech recognition system to improve rare word recognition.

Automatic Speech Recognition (ASR)


ADVISER: A Toolkit for Developing Multi-modal, Multi-domain and Socially-engaged Conversational Agents

1 code implementation • ACL 2020 • Chia-Yu Li, Daniel Ortega, Dirk Väth, Florian Lux, Lindsey Vanderlyn, Maximilian Schmidt, Michael Neumann, Moritz Völkel, Pavel Denisov, Sabrina Jenne, Zorica Kacarevic, Ngoc Thang Vu

We present ADVISER, an open-source, multi-domain dialog system toolkit that enables the development of multi-modal (incorporating speech, text and vision), socially-engaged (e.g., emotion recognition, engagement level prediction and backchanneling) conversational agents.

BIG-bench Machine Learning, Emotion Recognition


Multiclass Text Classification on Unbalanced, Sparse and Noisy Data

no code implementations • WS 2019 • Matthias Damaschk, Tillmann Dönicke, Florian Lux

This paper discusses methods to improve the performance of text classification on data that is difficult to classify due to a large number of unbalanced classes with noisy examples.

General Classification, Text Classification

