1 code implementation • 7 Mar 2024 • Qilang Ye, Zitong Yu, Rui Shao, Xinyu Xie, Philip Torr, Xiaochun Cao
This paper focuses on the challenge of answering questions in scenarios that are composed of rich and complex dynamic audio-visual components.
no code implementations • 12 Jan 2024 • Zaijing Li, Gongwei Chen, Rui Shao, Dongmei Jiang, Liqiang Nie
In this paper, we propose the Emotional Chain-of-Thought (ECoT), a plug-and-play prompting method that enhances the performance of LLMs on various emotional generation tasks by aligning with human emotional intelligence guidelines.
1 code implementation • 20 Nov 2023 • Gongwei Chen, Leyang Shen, Rui Shao, Xiang Deng, Liqiang Nie
1) Progressive incorporation of fine-grained spatial-aware visual knowledge.
1 code implementation • 26 Sep 2023 • Rui Shao, Tianxing Wu, Ziwei Liu
However, existing methods only focus on detecting one-step facial manipulation.
1 code implementation • 25 Sep 2023 • Rui Shao, Tianxing Wu, Jianlong Wu, Liqiang Nie, Ziwei Liu
HAMMER performs 1) manipulation-aware contrastive learning between two uni-modal encoders as shallow manipulation reasoning, and 2) modality-aware cross-attention by a multi-modal aggregator as deep manipulation reasoning.
1 code implementation • 1 Jun 2023 • Rui Shao, Tianxing Wu, Liqiang Nie, Ziwei Liu
Unlike existing deepfake detection methods that merely focus on low-level forgery patterns, the forgery detection process of our model can be regularized by generalizable high-level semantics from a pre-trained ViT and adapted by global and local low-level forgeries of deepfake data.
1 code implementation • CVPR 2023 • Rui Shao, Tianxing Wu, Ziwei Liu
In this paper, we highlight a new research problem for multi-modal fake media, namely Detecting and Grounding Multi-Modal Media Manipulation (DGM^4).
no code implementations • 4 Oct 2022 • Bochao Zhang, Rui Shao, Jingda Du, PC Yuen
Firstly, it will lead to overfitting to the test-time procedure, thus hurting the performance on the main task.
1 code implementation • 5 Jul 2022 • Rui Shao, Tianxing Wu, Ziwei Liu
Moreover, we build a comprehensive benchmark and set up rigorous evaluation protocols and metrics for this new research problem.
1 code implementation • 12 Feb 2022 • Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel
This paper proposes an Open-Set Defense Network with Clean-Adversarial Mutual Learning (OSDN-CAML) as a solution to the OSAD problem.
no code implementations • 25 Oct 2021 • Rui Shao, Bochao Zhang, Pong C. Yuen, Vishal M. Patel
The generalization ability of face presentation attack detection models to unseen attacks has become a key issue for real-world deployment, which can be improved when models are trained with face images from different input distributions and different types of spoof attacks.
no code implementations • 14 Apr 2021 • Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel
A face presentation attack detection model with good generalization can be obtained when it is trained with face images from different input distributions and different types of spoof attacks.
1 code implementation • ECCV 2020 • Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel
In this paper, we show that open-set recognition systems are vulnerable to adversarial attacks.
no code implementations • 29 May 2020 • Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel
A face presentation attack detection model with good generalization can be obtained when it is trained with face images from different input distributions and different types of spoof attacks.
1 code implementation • 25 Nov 2019 • Rui Shao, Xiangyuan Lan, Pong C. Yuen
Besides, to further enhance the generalization ability of our model, the proposed framework adopts a fine-grained learning strategy that simultaneously conducts meta-learning in a variety of domain shift scenarios in each iteration.
1 code implementation • CVPR 2019 • Rui Shao, Xiangyuan Lan, Jiawei Li, Pong C. Yuen
This work focuses on improving the generalization ability of face anti-spoofing methods from the perspective of domain generalization.