Search Results for author: Konstantinos Vougioukas

Found 11 papers, 4 papers with code

Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models

1 code implementation • 15 May 2023 • Antoni Bigata Casademunt, Rodrigo Mira, Nikita Drobyshev, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

Speech-driven animation has gained significant traction in recent years, with current methods achieving near-photorealistic results.

Face Generation

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision

no code implementations • CVPR 2023 • Xubo Liu, Egor Lakomkin, Konstantinos Vougioukas, Pingchuan Ma, Honglie Chen, Ruiming Xie, Morrie Doulaty, Niko Moritz, Jáchym Kolář, Stavros Petridis, Maja Pantic, Christian Fuegen

Furthermore, when combined with large-scale pseudo-labeled audio-visual data, SynthVSR yields a new state-of-the-art VSR WER of 16.9% using publicly available data only, surpassing the recent state-of-the-art approaches trained with 29 times more non-public machine-transcribed video data (90,000 hours).

Lip Reading speech-recognition +1

Diffused Heads: Diffusion Models Beat GANs on Talking-Face Generation

no code implementations • 6 Jan 2023 • Michał Stypułkowski, Konstantinos Vougioukas, Sen He, Maciej Zięba, Stavros Petridis, Maja Pantic

Talking face generation has historically struggled to produce head movements and natural facial expressions without guidance from additional reference videos.

Talking Face Generation Video Generation

End-to-End Video-To-Speech Synthesis using Generative Adversarial Networks

no code implementations • 27 Apr 2021 • Rodrigo Mira, Konstantinos Vougioukas, Pingchuan Ma, Stavros Petridis, Björn W. Schuller, Maja Pantic

In this work, we propose a new end-to-end video-to-speech model based on Generative Adversarial Networks (GANs) which translates spoken video to waveform end-to-end without using any intermediate representation or separate waveform synthesis algorithm.

Lip Reading Speech Synthesis

DINO: A Conditional Energy-Based GAN for Domain Translation

1 code implementation • ICLR 2021 • Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

Domain translation is the process of transforming data from one domain to another while preserving the common semantics.

Translation

Lips Don't Lie: A Generalisable and Robust Approach to Face Forgery Detection

1 code implementation • CVPR 2021 • Alexandros Haliassos, Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

Extensive experiments show that this simple approach significantly surpasses the state-of-the-art in terms of generalisation to unseen manipulations and robustness to perturbations, and shed light on the factors responsible for its performance.

Ranked #5 on DeepFake Detection on FakeAVCeleb

DeepFake Detection Lipreading +2

Visually Guided Self Supervised Learning of Speech Representations

no code implementations • 13 Jan 2020 • Abhinav Shukla, Konstantinos Vougioukas, Pingchuan Ma, Stavros Petridis, Maja Pantic

Self-supervised representation learning has recently attracted a lot of research interest for both the audio and visual modalities.

Ranked #8 on Speech Emotion Recognition on CREMA-D

Representation Learning Self-Supervised Learning +3

Speech-driven facial animation using polynomial fusion of features

no code implementations • 12 Dec 2019 • Triantafyllos Kefalas, Konstantinos Vougioukas, Yannis Panagakis, Stavros Petridis, Jean Kossaifi, Maja Pantic

Speech-driven facial animation involves using a speech signal to generate realistic videos of talking faces.

Tensor Decomposition

Video-Driven Speech Reconstruction using Generative Adversarial Networks

no code implementations • 14 Jun 2019 • Konstantinos Vougioukas, Pingchuan Ma, Stavros Petridis, Maja Pantic

Speech is a means of communication which relies on both audio and visual information.

Realistic Speech-Driven Facial Animation with GANs

no code implementations • 14 Jun 2019 • Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

We present an end-to-end system that generates videos of a talking head, using only a still image of a person and an audio clip containing speech, without relying on handcrafted intermediate features.

Audio-Visual Synchronization Lip Reading

End-to-End Speech-Driven Facial Animation with Temporal GANs

1 code implementation • 23 May 2018 • Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

To the best of our knowledge, this is the first method capable of generating subject-independent realistic videos directly from raw audio.

Lip Reading
