Search Results for author: Juan F. Montesinos

Found 6 papers, 5 papers with code

Speech inpainting: Context-based speech synthesis guided by video

no code implementations · 1 Jun 2023 · Juan F. Montesinos, Daniel Michelsanti, Gloria Haro, Zheng-Hua Tan, Jesper Jensen

Audio and visual modalities are inherently connected in speech signals: lip movements and facial expressions are correlated with speech sounds.

Speech Recognition +1

VocaLiST: An Audio-Visual Synchronisation Model for Lips and Voices

1 code implementation · 5 Apr 2022 · Venkatesh S. Kadandale, Juan F. Montesinos, Gloria Haro

Finally, we use the frozen visual features learned by our lip synchronisation model in the singing voice separation task to outperform a baseline audio-visual model which was trained end-to-end.

Audio-Visual Synchronization · Music Source Separation
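The transfer described in the VocaLiST entry above, reusing frozen visual features from a synchronisation model inside a singing voice separation network, can be pictured with a minimal PyTorch-style sketch. The module names (LipEncoder, SeparationHead), feature dimensions, and mask-based fusion below are illustrative assumptions, not VocaLiST's actual architecture.

```python
import torch
import torch.nn as nn

# Hypothetical modules: names and shapes are illustrative, not VocaLiST's actual layers.
class LipEncoder(nn.Module):
    """Stand-in for a visual encoder pretrained on lip-voice synchronisation."""
    def __init__(self, dim=512):
        super().__init__()
        self.net = nn.Sequential(nn.Conv1d(96, dim, kernel_size=5, padding=2), nn.ReLU())

    def forward(self, lips):          # lips: (batch, 96, time)
        return self.net(lips)         # (batch, dim, time)

class SeparationHead(nn.Module):
    """Predicts a spectrogram mask for the singing voice from audio + visual features."""
    def __init__(self, n_freq=513, dim=512):
        super().__init__()
        self.proj = nn.Conv1d(n_freq + dim, n_freq, kernel_size=3, padding=1)

    def forward(self, mix_spec, visual_feats):   # (batch, n_freq, time), (batch, dim, time)
        fused = torch.cat([mix_spec, visual_feats], dim=1)
        return torch.sigmoid(self.proj(fused))   # soft mask in [0, 1]

# Load the synchronisation-pretrained encoder and freeze it,
# so only the separation head is updated during training.
visual_encoder = LipEncoder()
for p in visual_encoder.parameters():
    p.requires_grad = False
visual_encoder.eval()

separator = SeparationHead()
optimizer = torch.optim.Adam(separator.parameters(), lr=1e-4)

mix_spec = torch.randn(2, 513, 100)   # dummy mixture magnitude spectrogram
lips = torch.randn(2, 96, 100)        # dummy lip-region features

with torch.no_grad():
    feats = visual_encoder(lips)      # frozen visual features
mask = separator(mix_spec, feats)
voice_estimate = mask * mix_spec
```

Keeping the encoder frozen preserves the representation learned during synchronisation pretraining while only the separation head is optimised on the downstream task.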

Solos: A Dataset for Audio-Visual Music Analysis

1 code implementation · 14 Jun 2020 · Juan F. Montesinos, Olga Slizovskaia, Gloria Haro

In this paper, we present a new dataset of music performance videos which can be used for training machine learning methods for multiple tasks such as audio-visual blind source separation and localization, cross-modal correspondences, cross-modal generation and, in general, any audio-visual self-supervised task.

Audio Source Separation · Audio-Visual Synchronization +1 · Audio and Speech Processing · Databases · Sound

Multi-channel U-Net for Music Source Separation

2 code implementations · 23 Mar 2020 · Venkatesh S. Kadandale, Juan F. Montesinos, Gloria Haro, Emilia Gómez

However, Conditioned U-Net (C-U-Net) uses a control mechanism to train a single model for multi-source separation and attempts to achieve a performance comparable to that of the dedicated models.

Music Source Separation
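The control mechanism mentioned in the snippet above can be pictured as FiLM-style conditioning: a one-hot vector selecting the target source modulates intermediate feature maps, so a single shared network separates different sources depending on the condition it receives. The sketch below is a generic illustration of that idea under assumed layer sizes; it is not C-U-Net's or the multi-channel U-Net's actual implementation.

```python
import torch
import torch.nn as nn

class FiLMControl(nn.Module):
    """Maps a one-hot source selector to per-channel scale (gamma) and shift (beta)."""
    def __init__(self, n_sources, n_channels):
        super().__init__()
        self.to_gamma = nn.Linear(n_sources, n_channels)
        self.to_beta = nn.Linear(n_sources, n_channels)

    def forward(self, features, condition):        # features: (B, C, F, T), condition: (B, n_sources)
        gamma = self.to_gamma(condition)[:, :, None, None]
        beta = self.to_beta(condition)[:, :, None, None]
        return gamma * features + beta              # feature-wise linear modulation

# A single conditioned model: the same conv trunk produces different separations
# depending on which one-hot condition vector is supplied.
n_sources, channels = 4, 16
trunk = nn.Conv2d(1, channels, kernel_size=3, padding=1)
control = FiLMControl(n_sources, channels)
head = nn.Conv2d(channels, 1, kernel_size=3, padding=1)

mix = torch.randn(2, 1, 512, 128)                  # dummy mixture spectrograms
want_vocals = torch.eye(n_sources)[[0, 0]]         # select source 0 ("vocals") for both items

feats = torch.relu(trunk(mix))
feats = control(feats, want_vocals)
mask = torch.sigmoid(head(feats))                  # soft mask for the requested source
vocals_estimate = mask * mix
```

At inference time, supplying a different condition vector requests a different source from the same trained weights, which is what lets one conditioned model stand in for several dedicated per-source models.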
