Search Results for author: Iván Vallés-Pérez

Found 6 papers, 3 papers with code

Enhancing the Stability of LLM-based Speech Generation Systems through Self-Supervised Representations

no code implementations • 5 Feb 2024 • Álvaro Martín-Cortinas, Daniel Sáez-Trigueros, Iván Vallés-Pérez, Biel Tura-Vecino, Piotr Biliński, Mateusz Lajszczak, Grzegorz Beringer, Roberto Barra-Chicote, Jaime Lorenzo-Trueba

Using speaker-disentangled codes to train LLMs for text-to-speech (TTS) allows the LLM to generate the content and the style of the speech only from the text, similarly to humans, while the speaker identity is provided by the decoder of the VC model.

In-Context Learning, Voice Conversion

Empirical study of the modulus as activation function in computer vision applications

1 code implementation • 15 Jan 2023 • Iván Vallés-Pérez, Emilio Soria-Olivas, Marcelino Martínez-Sober, Antonio J. Serrano-López, Joan Vila-Francés, Juan Gómez-Sanchís

In this work we propose a new non-monotonic activation function: the modulus.
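As a rough illustration of what a modulus activation looks like, the sketch below assumes it is the element-wise absolute value, f(x) = |x|. The function names `modulus` and `modulus_grad` are hypothetical and are not taken from the paper's code; the value chosen for the derivative at zero is likewise an assumption.

```python
def modulus(x: float) -> float:
    """Modulus activation, assumed here to be f(x) = |x|."""
    return abs(x)


def modulus_grad(x: float) -> float:
    """Derivative of |x|: +1 for x > 0, -1 for x < 0.

    The derivative is undefined at x = 0; returning 0.0 there is an
    assumption made for this sketch.
    """
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0


# Non-monotonic: the output falls as x goes from -1 to 0, then rises again.
outputs = [modulus(x) for x in (-1.0, 0.0, 1.0)]
```

Unlike ReLU, which maps every negative input to zero, this function keeps the magnitude of negative inputs, which is what makes it non-monotonic.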

Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech

no code implementations • 4 Nov 2022 • Xin Zhang, Iván Vallés-Pérez, Andreas Stolcke, Chengzhu Yu, Jasha Droppo, Olabanji Shonibare, Roberto Barra-Chicote, Venkatesh Ravichandran

By fine-tuning an ASR model on synthetic stuttered speech we are able to reduce word error by 5.7% relative on stuttered utterances, with only minor (<0.2% relative) degradation for fluent utterances.
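To make the word "relative" concrete, the sketch below shows how a relative error reduction is usually computed. The helper name `relative_reduction` and the two word error rates are hypothetical, chosen only so the arithmetic is easy to follow; they are not figures from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` lowers `baseline`: (baseline - improved) / baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# Hypothetical word error rates in percent, not taken from the paper.
baseline_wer = 20.0
finetuned_wer = 18.86

reduction = relative_reduction(baseline_wer, finetuned_wer)
print(f"{reduction:.1%} relative reduction")
```

With these made-up numbers the absolute drop is 1.14 percentage points, while the relative drop is 5.7%. The two differ because the relative figure is measured against the baseline rather than against 100%.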

Automatic Speech Recognition (ASR) +1

Approaching sales forecasting using recurrent neural networks and transformers

1 code implementation • 16 Apr 2022 • Iván Vallés-Pérez, Emilio Soria-Olivas, Marcelino Martínez-Sober, Antonio J. Serrano-López, Juan Gómez-Sanchís, Fernando Mateo

Accurate and fast demand forecasting is one of the hot topics in supply chain for enabling the precise execution of the corresponding downstream processes (inbound and outbound planning, inventory placement, network planning, etc.).

End-to-end Keyword Spotting using Xception-1d

1 code implementation • 9 Oct 2021 • Iván Vallés-Pérez, Juan Gómez-Sanchis, Marcelino Martínez-Sober, Joan Vila-Francés, Antonio J. Serrano-López, Emilio Soria-Olivas

The field of conversational agents is growing fast and there is an increasing need for algorithms that enhance natural interaction.

Keyword Spotting

Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows

no code implementations • 10 Jun 2021 • Iván Vallés-Pérez, Julian Roth, Grzegorz Beringer, Roberto Barra-Chicote, Jasha Droppo

This paper proposes a new neural text-to-speech model that approaches the disentanglement problem by conditioning a Tacotron2-like architecture on flow-normalized speaker embeddings, and by substituting the reference encoder with a new learned latent distribution responsible for modeling the intra-sentence variability due to the prosody.

Disentanglement, Sentence
