Search Results for author: Dan Bigioi

Found 5 papers, 0 papers with code

LatentColorization: Latent Diffusion-Based Speaker Video Colorization

no code implementations • 9 May 2024 • Rory Ward, Dan Bigioi, Shubhajit Basak, John G. Breslin, Peter Corcoran

While current research predominantly focuses on image-based colorization, the domain of video-based colorization remains relatively unexplored.

Colorization

Synthetic Speaking Children -- Why We Need Them and How to Make Them

no code implementations • 8 Nov 2023 • Muhammad Ali Farooq, Dan Bigioi, Rishabh Jain, Wang Yao, Mariam Yiwere, Peter Corcoran

Contemporary Human Computer Interaction (HCI) research relies primarily on neural network models for machine vision and speech understanding of a system user.

Speech Driven Video Editing via an Audio-Conditioned Diffusion Model

no code implementations • 10 Jan 2023 • Dan Bigioi, Shubhajit Basak, Michał Stypułkowski, Maciej Zięba, Hugh Jordan, Rachel McDonnell, Peter Corcoran

Taking inspiration from recent developments in visual generative tasks using diffusion models, we propose a method for end-to-end speech-driven video editing using a denoising diffusion model.

Denoising, Face Model

A Wav2vec2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition

no code implementations • 6 Apr 2022 • Rishabh Jain, Andrei Barcovschi, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu

Our models outperformed wav2vec2 BASE 960, which is considered a state-of-the-art ASR model on adult speech, on child speech by using just 10 hours of child speech data in finetuning.

Automatic Speech Recognition (ASR)
