no code implementations • 6 Apr 2024 • Simindokht Jahangard, Zhixi Cai, Shiki Wen, Hamid Rezatofighi
Understanding human social behaviour is crucial in computer vision and robotics.
no code implementations • 19 Mar 2024 • Fucai Ke, Zhixi Cai, Simindokht Jahangard, Weiqing Wang, Pari Delir Haghighi, Hamid Rezatofighi
Recent advances in visual reasoning (VR), particularly with the aid of Large Vision-Language Models (VLMs), show promise but require access to large-scale datasets and face challenges such as high computational costs and limited generalization capabilities.
1 code implementation • 26 Nov 2023 • Zhixi Cai, Shreya Ghosh, Aman Pankaj Adatia, Munawar Hayat, Abhinav Dhall, Kalin Stefanov
The comprehensive benchmark of the proposed dataset utilizing state-of-the-art deepfake detection and localization methods indicates a significant drop in performance compared to previous datasets.
no code implementations • 10 May 2023 • Shreya Ghosh, Rakibul Hasan, Pradyumna Agrawal, Zhixi Cai, Susannah Soon, Abhinav Dhall, Tom Gedeon
To this end, we design a user interface, i.e. a mobile or desktop application, that provides an automatic feedback mechanism integrating Pavlok with a deep learning-based model to detect certain behaviours.
1 code implementation • 3 May 2023 • Zhixi Cai, Shreya Ghosh, Abhinav Dhall, Tom Gedeon, Kalin Stefanov, Munawar Hayat
The proposed baseline method, Boundary Aware Temporal Forgery Detection (BA-TFD), is a 3D Convolutional Neural Network-based architecture which effectively captures multimodal manipulations.
Ranked #1 on Temporal Forgery Localization on ForgeryNet
1 code implementation • CVPR 2023 • Zhixi Cai, Shreya Ghosh, Kalin Stefanov, Abhinav Dhall, Jianfei Cai, Hamid Rezatofighi, Reza Haffari, Munawar Hayat
This paper proposes a self-supervised approach to learn universal facial representations from videos that can transfer across a variety of facial analysis tasks such as Facial Attribute Recognition (FAR), Facial Expression Recognition (FER), DeepFake Detection (DFD), and Lip Synchronization (LS).
Ranked #1 on Emotion Classification on CMU-MOSEI
1 code implementation • 13 Apr 2022 • Zhixi Cai, Kalin Stefanov, Abhinav Dhall, Munawar Hayat
Our baseline method for benchmarking the proposed dataset is a 3DCNN model, termed Boundary Aware Temporal Forgery Detection (BA-TFD), which is guided by contrastive, boundary matching, and frame classification loss functions.
Ranked #1 on DeepFake Detection on LAV-DF