Search Results for author: Dan Bigioi

Found 5 papers, 0 papers with code

LatentColorization: Latent Diffusion-Based Speaker Video Colorization

no code implementations • 9 May 2024 • Rory Ward, Dan Bigioi, Shubhajit Basak, John G. Breslin, Peter Corcoran

While current research predominantly focuses on image-based colorization, the domain of video-based colorization remains relatively unexplored.

Colorization


Synthetic Speaking Children -- Why We Need Them and How to Make Them

no code implementations • 8 Nov 2023 • Muhammad Ali Farooq, Dan Bigioi, Rishabh Jain, Wang Yao, Mariam Yiwere, Peter Corcoran

Contemporary Human Computer Interaction (HCI) research relies primarily on neural network models for machine vision and speech understanding of a system user.


Speech Driven Video Editing via an Audio-Conditioned Diffusion Model

no code implementations • 10 Jan 2023 • Dan Bigioi, Shubhajit Basak, Michał Stypułkowski, Maciej Zięba, Hugh Jordan, Rachel McDonnell, Peter Corcoran

Taking inspiration from recent developments in visual generative tasks using diffusion models, we propose a method for end-to-end speech-driven video editing using a denoising diffusion model.

Denoising, Face Model, +2


A Wav2vec2-Based Experimental Study on Self-Supervised Learning Methods to Improve Child Speech Recognition

no code implementations • 6 Apr 2022 • Rishabh Jain, Andrei Barcovschi, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu

On child speech, our models outperformed wav2vec2 BASE 960, which is considered a state-of-the-art ASR model for adult speech, using just 10 hours of child speech data for fine-tuning.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2


A Text-to-Speech Pipeline, Evaluation Methodology, and Initial Fine-Tuning Results for Child Speech Synthesis

no code implementations • 22 Mar 2022 • Rishabh Jain, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu

Speech synthesis has come a long way: current text-to-speech (TTS) models can now generate natural, human-sounding speech.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +3

