no code implementations • 9 May 2024 • Rory Ward, Dan Bigioi, Shubhajit Basak, John G. Breslin, Peter Corcoran
While current research predominantly focuses on image-based colorization, the domain of video-based colorization remains relatively unexplored.
no code implementations • 8 Nov 2023 • Muhammad Ali Farooq, Dan Bigioi, Rishabh Jain, Wang Yao, Mariam Yiwere, Peter Corcoran
Contemporary Human-Computer Interaction (HCI) research relies primarily on neural network models for machine vision and speech understanding of a system user.
no code implementations • 10 Jan 2023 • Dan Bigioi, Shubhajit Basak, Michał Stypułkowski, Maciej Zięba, Hugh Jordan, Rachel McDonnell, Peter Corcoran
Taking inspiration from recent developments in visual generative tasks using diffusion models, we propose a method for end-to-end speech-driven video editing using a denoising diffusion model.
no code implementations • 6 Apr 2022 • Rishabh Jain, Andrei Barcovschi, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu
Our models outperformed wav2vec2 BASE 960, which is considered a state-of-the-art ASR model on adult speech, on child speech while using just 10 hours of child speech data for finetuning.
Automatic Speech Recognition (ASR)
no code implementations • 22 Mar 2022 • Rishabh Jain, Mariam Yiwere, Dan Bigioi, Peter Corcoran, Horia Cucu
Speech synthesis has come a long way, as current text-to-speech (TTS) models can now generate natural, human-sounding speech.
Automatic Speech Recognition (ASR)