no code implementations • 15 Apr 2024 • Xinyu Xie, Yawen Cui, Chio-in Ieong, Tao Tan, Xiaozhi Zhang, Xubin Zheng, Zitong Yu
In this paper, we propose FusionMamba, a novel dynamic feature enhancement method for multimodal image fusion with Mamba.
no code implementations • 10 Apr 2024 • Changsheng Chen, Yongyi Deng, Liangwei Lin, Zitong Yu, Zhimao Lai
Document Presentation Attack Detection (DPAD) is an important measure in protecting the authenticity of a document image.
no code implementations • 21 Mar 2024 • Xun Lin, Yi Yu, Song Xia, Jue Jiang, Haoran Wang, Zitong Yu, Yizhong Liu, Ying Fu, Shuai Wang, Wenzhong Tang, Alex Kot
This is particularly true for medical image segmentation (MIS) datasets, where the processes of collection and fine-grained annotation are time-intensive and laborious.
1 code implementation • 11 Mar 2024 • Qilang Ye, Zitong Yu, Xin Liu
Audio-visual question answering (AVQA) requires reference to video content and auditory information, followed by correlating the question to predict the most precise answer.
Audio-Visual Question Answering (AVQA)
2 code implementations • 9 Mar 2024 • Hao Lu, Xuesong Niu, Jiyao Wang, Yin Wang, Qingyong Hu, Jiaqi Tang, Yuting Zhang, Kaishen Yuan, Bin Huang, Zitong Yu, Dengbo He, Shuiguang Deng, Hao Chen, Yingcong Chen, Shiguang Shan
In conclusion, this paper provides valuable insights into the potential applications and challenges of MLLMs in human-centric computing.
1 code implementation • 7 Mar 2024 • Kaishen Yuan, Zitong Yu, Xin Liu, Weicheng Xie, Huanjing Yue, Jingyu Yang
Facial Action Units (AUs) are a vital concept in affective computing, and AU detection has long been an active research topic.
Ranked #1 on Facial Action Unit Detection on DISFA
1 code implementation • 7 Mar 2024 • Qilang Ye, Zitong Yu, Rui Shao, Xinyu Xie, Philip Torr, Xiaochun Cao
This paper focuses on the challenge of answering questions in scenarios that are composed of rich and complex dynamic audio-visual components.
Audio-Visual Question Answering (AVQA)
no code implementations • 29 Feb 2024 • Chao Hao, Zitong Yu, Xin Liu, Jun Xu, Huanjing Yue, Jingyu Yang
Camouflaged object detection (COD) and salient object detection (SOD) are two distinct yet closely related computer vision tasks widely studied during the past decades.
3 code implementations • 29 Feb 2024 • Xun Lin, Shuai Wang, Rizhao Cai, Yizhong Liu, Ying Fu, Zitong Yu, Wenzhong Tang, Alex Kot
Face Anti-Spoofing (FAS) is crucial for securing face recognition systems against presentation attacks.
2 code implementations • 6 Feb 2024 • Yichen Shi, Yuhao Gao, Yingxin Lai, Hongyang Wang, Jun Feng, Lei He, Jun Wan, Changsheng Chen, Zitong Yu, Xiaochun Cao
For the face forgery detection task, we evaluate GAN-based and diffusion-based data with both visual and acoustic modalities.
no code implementations • 3 Feb 2024 • Yaning Zhang, Zitong Yu, Xiaobin Huang, Linlin Shen, Jianfeng Ren
In this paper, we propose GenFace, a large-scale, diverse, and fine-grained high-fidelity dataset to facilitate the advancement of deepfake detection. It contains a large number of forged faces generated by advanced generators, such as diffusion-based models, along with more detailed labels for the manipulation approaches and the generators used.
1 code implementation • 28 Sep 2023 • Xun Lin, Wenzhong Tang, Haoran Wang, Yizhong Liu, Yakun Ju, Shuai Wang, Zitong Yu
Compared to image duplication and synthesis, image splicing detection is more challenging due to the lack of reference images and the typically small tampered areas.
2 code implementations • 7 Sep 2023 • Rizhao Cai, Zitong Yu, Chenqi Kong, Haoliang Li, Changsheng Chen, Yongjian Hu, Alex Kot
Face Anti-Spoofing (FAS) aims to detect malicious attempts to invade a face recognition system by presenting spoofed faces.
no code implementations • 17 Aug 2023 • Shuangpeng Han, Rizhao Cai, Yawen Cui, Zitong Yu, Yongjian Hu, Alex Kot
To further improve generalization, we conduct hyperbolic contrastive learning for the bonafide only while relaxing the constraints on diverse spoofing attacks.
2 code implementations • 15 Aug 2023 • Xin Liu, Kaishen Yuan, Xuesong Niu, Jingang Shi, Zitong Yu, Huanjing Yue, Jingyu Yang
Anatomically, there are innumerable correlations between AUs, which contain rich information and are vital for AU detection.
Ranked #2 on Facial Action Unit Detection on DISFA
no code implementations • 26 Jul 2023 • Zitong Yu, Rizhao Cai, Yawen Cui, Ajian Liu, Changsheng Chen
Recently, vision transformer based multimodal learning methods have been proposed to improve the robustness of face anti-spoofing (FAS) systems.
1 code implementation • 29 Jun 2023 • Yingxin Lai, Zhiming Luo, Zitong Yu
The rapid advancements in computer vision have stimulated remarkable progress in face forgery techniques, capturing the dedicated attention of researchers committed to detecting forgeries and precisely localizing manipulated areas.
no code implementations • 7 Jun 2023 • Md Ashequr Rahman, Zitong Yu, Richard Laforest, Craig K. Abbey, Barry A. Siegel, Abhinav K. Jha
There is an important need for methods to process myocardial perfusion imaging (MPI) SPECT images acquired at lower radiation dose and/or acquisition time such that the processed images improve observer performance on the clinical task of detecting perfusion defects.
1 code implementation • 4 Jun 2023 • Xin Liu, Yuting Zhang, Zitong Yu, Hao Lu, Huanjing Yue, Jingyu Yang
However, they focus on contrastive learning between samples, which neglects the inherent self-similar prior in physiological signals and seems to have limited ability to cope with noise.
no code implementations • 5 May 2023 • Ajian Liu, Zichang Tan, Zitong Yu, Chenxu Zhao, Jun Wan, Yanyan Liang, Zhen Lei, Du Zhang, Stan Z. Li, Guodong Guo
The availability of handy multi-modal (i.e., RGB-D) sensors has brought about a surge of face anti-spoofing research.
no code implementations • ICCV 2023 • Rizhao Cai, Yawen Cui, Zhi Li, Zitong Yu, Haoliang Li, Yongjian Hu, Alex Kot
To alleviate the forgetting of previous domains without using previous data, we propose the Proxy Prototype Contrastive Regularization (PPCR) to constrain the continual learning with previous domain knowledge from the proxy prototypes.
1 code implementation • CVPR 2023 • Hao Lu, Zitong Yu, Xuesong Niu, Yingcong Chen
We show that most domain generalization methods do not work well in this problem, as domain labels are ambiguous in complicated environmental changes.
1 code implementation • ICCV 2023 • Xiaobao Guo, Nithish Muthuchamy Selvaraj, Zitong Yu, Adams Wai-Kin Kong, Bingquan Shen, Alex Kot
Despite this, deception detection research is hindered by the lack of high-quality deception datasets, as well as the difficulties of learning multimodal features effectively.
no code implementations • 3 Mar 2023 • Zitong Yu, Md Ashequr Rahman, Richard Laforest, Thomas H. Schindler, Robert J. Gropler, Richard L. Wahl, Barry A. Siegel, Abhinav K. Jha
Our objectives were to (1) investigate whether evaluation with these FoMs is consistent with objective clinical-task-based evaluation; (2) provide a theoretical analysis for determining the impact of denoising on signal-detection tasks; (3) demonstrate the utility of virtual clinical trials (VCTs) to evaluate DL-based methods.
no code implementations • 1 Mar 2023 • Md Ashequr Rahman, Zitong Yu, Barry A. Siegel, Abhinav K. Jha
However, while promising, studies have shown that these methods may have limited impact on the performance of clinical tasks in SPECT.
1 code implementation • 12 Feb 2023 • Yawen Cui, Zitong Yu, Rizhao Cai, Xun Wang, Alex C. Kot, Li Liu
The goal of Few-Shot Continual Learning (FSCL) is to incrementally learn novel tasks from limited labeled samples while preserving previous capabilities; however, current FSCL methods all target the class-incremental setting.
no code implementations • 11 Feb 2023 • Zhaoxu Li, Zitong Yu, Nithish Muthuchamy Selvaraj, Xiaobao Guo, Bingquan Shen, Adams Wai-Kin Kong, Alex Kot
Detecting deception from human behavior is vital in many fields, such as customs security and multimedia anti-fraud.
no code implementations • 11 Feb 2023 • Zitong Yu, Rizhao Cai, Yawen Cui, Xin Liu, Yongjian Hu, Alex Kot
In this paper, we investigate three key factors (i.e., inputs, pre-training, and finetuning) in ViT for multimodal FAS with RGB, Infrared (IR), and Depth.
no code implementations • 7 Feb 2023 • Zitong Yu, Yuming Shen, Jingang Shi, Hengshuang Zhao, Yawen Cui, Jiehua Zhang, Philip Torr, Guoying Zhao
As key modules in PhysFormer, the temporal difference transformers first enhance the quasi-periodic rPPG features with temporal difference guided global attention, and then refine the local spatio-temporal representation against interference.
no code implementations • 7 Dec 2022 • Zitong Yu, Chenxu Zhao, Zhen Lei
Face recognition technology has been widely used in daily interactive applications such as check-in and mobile payment due to its convenience and high accuracy.
1 code implementation • 30 Nov 2022 • Jianwei Li, Zitong Yu, Jingang Shi
Remote photoplethysmography (rPPG) enables non-contact heart rate (HR) estimation from facial videos, which is considerably more convenient than traditional contact-based measurement.
no code implementations • 4 Nov 2022 • Jiehua Zhang, Xueyang Zhang, Zhuo Su, Zitong Yu, Yanghe Feng, Xin Lu, Matti Pietikäinen, Li Liu
For ViTs, DyBinaryCCT presents the superiority of the convolutional embedding layer in fully binarized ViTs and achieves 56.1% on the ImageNet dataset, which is nearly 9% higher than the baseline.
no code implementations • 5 Sep 2022 • Changsheng Chen, Lin Zhao, Rizhao Cai, Zitong Yu, Jiwu Huang, Alex C. Kot
We integrate the trained FANet with practical recapturing detection schemes in face anti-spoofing and recaptured document detection tasks.
no code implementations • 10 Aug 2022 • Zitong Yu, Rizhao Cai, Zhi Li, Wenhan Yang, Jingang Shi, Alex C. Kot
In this paper, we establish the first joint face spoofing and forgery detection benchmark using both visual appearance and physiological rPPG cues.
no code implementations • 20 Jul 2022 • Yawen Cui, Zitong Yu, Wei Peng, Li Liu
Few-Shot Class-Incremental Learning (FSCIL) aims at incrementally learning novel classes from a few labeled samples while simultaneously avoiding overfitting and catastrophic forgetting.
1 code implementation • CVPR 2022 • Zhuo Wang, Zezheng Wang, Zitong Yu, Weihong Deng, Jiahong Li, Tingting Gao, Zhongyuan Wang
A novel Shuffled Style Assembly Network (SSAN) is proposed to extract and reassemble different content and style features for a stylized feature space.
no code implementations • 3 Mar 2022 • Zuheng Ming, Zitong Yu, Musab Al-Ghadi, Muriel Visani, Muhammad Muzzamil Luqman, Jean-Christophe Burie
Instead of using coarse, single-scale image patches as in ViT, we propose the Multi-scale Multi-Head Self-Attention (MsMHSA) architecture, which assigns multi-scale patch partitions of the Q, K, V feature maps to the transformer heads in a coarse-to-fine manner. This enables the model to learn a fine-grained representation for pixel-level discrimination in face PAD.
no code implementations • 3 Mar 2022 • Zitong Yu, Md Ashequr Rahman, Abhinav K. Jha
To achieve this goal, we conducted a task-based characterization of a DL-based denoising approach for individual signal properties.
1 code implementation • 16 Feb 2022 • Zitong Yu, Ajian Liu, Chenxu Zhao, Kevin H. M. Cheng, Xu Cheng, Guoying Zhao
Can we train a unified model, and flexibly deploy it under various modality scenarios?
no code implementations • 21 Dec 2021 • Zitong Yu, Jukka Komulainen, Xiaobai Li, Guoying Zhao
Face presentation attack detection (PAD) has received increasing attention ever since vulnerabilities to spoofing became widely recognized.
1 code implementation • 14 Dec 2021 • Haoyu Chen, Hao Tang, Zitong Yu, Nicu Sebe, Guoying Zhao
Specifically, we propose a novel geometry-contrastive Transformer that efficiently perceives, in a 3D-structured manner, the global geometric inconsistencies across the given meshes.
1 code implementation • 24 Nov 2021 • Zezheng Wang, Zitong Yu, Xun Wang, Yunxiao Qin, Jiahong Li, Chenxu Zhao, Zhen Lei, Xin Liu, Size Li, Zhongyuan Wang
Face anti-spoofing (FAS) plays a crucial role in securing face recognition systems.
1 code implementation • CVPR 2022 • Zitong Yu, Yuming Shen, Jingang Shi, Hengshuang Zhao, Philip Torr, Guoying Zhao
Remote photoplethysmography (rPPG), which aims at measuring heart activities and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing).
no code implementations • 12 Nov 2021 • Yunxiao Qin, Zitong Yu, Longbin Yan, Zezheng Wang, Chenxu Zhao, Zhen Lei
The meta-teacher is trained in a bi-level optimization manner so that it learns to supervise PA detectors in learning rich spoofing cues.
no code implementations • 16 Aug 2021 • Ajian Liu, Chenxu Zhao, Zitong Yu, Anyang Su, Xing Liu, Zijian Kong, Jun Wan, Sergio Escalera, Hugo Jair Escalante, Zhen Lei, Guodong Guo
The threat of 3D masks to face recognition systems is increasingly serious and has drawn wide concern among researchers.
2 code implementations • ICCV 2021 • Zhuo Su, Wenzhe Liu, Zitong Yu, Dewen Hu, Qing Liao, Qi Tian, Matti Pietikäinen, Li Liu
A faster version of PiDiNet with fewer than 0.1M parameters can still achieve performance comparable to the state of the art at 200 FPS.
Ranked #2 on Edge Detection on BRIND
1 code implementation • CVPR 2021 • Xin Liu, Henglin Shi, Haoyu Chen, Zitong Yu, Xiaobai Li, Guoying Zhao
We introduce a new dataset for emotional artificial intelligence research: the identity-free video dataset for Micro-Gesture Understanding and Emotion analysis (iMiGUE).
3 code implementations • 28 Jun 2021 • Zitong Yu, Yunxiao Qin, Xiaobai Li, Chenxu Zhao, Zhen Lei, Guoying Zhao
Face anti-spoofing (FAS) has lately attracted increasing attention due to its vital role in securing face recognition systems from presentation attacks (PAs).
no code implementations • 18 May 2021 • Ruijing Yang, Ziyu Guan, Zitong Yu, Xiaoyi Feng, Jinye Peng, Guoying Zhao
The framework is able to capture both local and long-range dependencies via the proposed attention mechanism for the learned appearance representations, which are further enriched by temporally attended physiological cues (remote photoplethysmography, rPPG) that are recovered from videos in the auxiliary task.
1 code implementation • 4 May 2021 • Zitong Yu, Yunxiao Qin, Hengshuang Zhao, Xiaobai Li, Guoying Zhao
In this paper, we propose two Cross Central Difference Convolutions (C-CDC), which exploit the difference of the center and surround sparse local features from the horizontal/vertical and diagonal directions, respectively.
no code implementations • 15 Apr 2021 • Zitong Yu, Xiaobai Li, Pichao Wang, Guoying Zhao
3D mask face presentation attack detection (PAD) plays a vital role in securing face recognition systems from emergent 3D mask attacks.
no code implementations • 13 Apr 2021 • Ajian Liu, Chenxu Zhao, Zitong Yu, Jun Wan, Anyang Su, Xing Liu, Zichang Tan, Sergio Escalera, Junliang Xing, Yanyan Liang, Guodong Guo, Zhen Lei, Stan Z. Li, Du Zhang
To bridge the gap to real-world applications, we introduce a large-scale High-Fidelity Mask dataset, namely CASIA-SURF HiFiMask (briefly HiFiMask).
no code implementations • 10 Feb 2021 • Zitong Yu, Md Ashequr Rahman, Thomas Schindler, Richard Laforest, Abhinav K. Jha
The proposed method uses data acquired in the scatter window to reconstruct an initial estimate of the attenuation map using a physics-based approach.
Medical Physics
1 code implementation • 24 Nov 2020 • Zitong Yu, Xiaobai Li, Jingang Shi, Zhaoqiang Xia, Guoying Zhao
Face anti-spoofing (FAS) plays a vital role in securing face recognition systems from presentation attacks (PAs).
no code implementations • 3 Nov 2020 • Zitong Yu, Jun Wan, Yunxiao Qin, Xiaobai Li, Stan Z. Li, Guoying Zhao
Face anti-spoofing (FAS) plays a vital role in securing face recognition systems.
1 code implementation • 21 Aug 2020 • Zitong Yu, Benjia Zhou, Jun Wan, Pichao Wang, Haoyu Chen, Xin Liu, Stan Z. Li, Guoying Zhao
Gesture recognition has attracted considerable attention owing to its great potential in applications.
no code implementations • 10 Aug 2020 • Haoyu Chen, Zitong Yu, Xin Liu, Wei Peng, Yoon Lee, Guoying Zhao
To address the problem of training on small datasets for action recognition tasks, most prior works are either based on a large number of training samples or require pre-trained models transferred from other large datasets to tackle overfitting problems.
1 code implementation • ECCV 2020 • Xuesong Niu, Zitong Yu, Hu Han, Xiaobai Li, Shiguang Shan, Guoying Zhao
Remote physiological measurements, e.g., remote photoplethysmography (rPPG) based measurement of heart rate (HR), heart rate variability (HRV), and respiration frequency (RF), are playing an increasingly important role in application scenarios where contact measurement is inconvenient or impossible.
no code implementations • ECCV 2020 • Zitong Yu, Xiaobai Li, Xuesong Niu, Jingang Shi, Guoying Zhao
In this paper we rephrase face anti-spoofing as a material recognition problem and combine it with classical human material perception [1], intending to extract discriminative and robust features for FAS.
no code implementations • 26 Apr 2020 • Zitong Yu, Xiaobai Li, Xuesong Niu, Jingang Shi, Guoying Zhao
Remote photoplethysmography (rPPG), which aims at measuring heart activities without any contact, has great potential in many applications (e.g., remote healthcare).
1 code implementation • 17 Apr 2020 • Zitong Yu, Yunxiao Qin, Xiaobai Li, Zezheng Wang, Chenxu Zhao, Zhen Lei, Guoying Zhao
Face anti-spoofing (FAS) plays a vital role in securing face recognition systems from presentation attacks.
no code implementations • 26 Mar 2020 • Xiaobai Li, Hu Han, Hao Lu, Xuesong Niu, Zitong Yu, Antitza Dantcheva, Guoying Zhao, Shiguang Shan
Remote measurement of physiological signals from videos is an emerging topic.
6 code implementations • CVPR 2020 • Zezheng Wang, Zitong Yu, Chenxu Zhao, Xiangyu Zhu, Yunxiao Qin, Qiusheng Zhou, Feng Zhou, Zhen Lei
Depth-supervised learning has proven to be one of the most effective methods for face anti-spoofing.
6 code implementations • CVPR 2020 • Zitong Yu, Chenxu Zhao, Zezheng Wang, Yunxiao Qin, Zhuo Su, Xiaobai Li, Feng Zhou, Guoying Zhao
Here we propose a novel frame-level FAS method based on Central Difference Convolution (CDC), which captures intrinsic detailed patterns by aggregating both intensity and gradient information.
Ranked #4 on Face Anti-Spoofing on OULU-NPU
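The idea of aggregating intensity and gradient information in a single convolution can be illustrated with a minimal sketch. It assumes the commonly cited formulation in which the ordinary convolution response is reduced by `theta` times the center pixel multiplied by the sum of the kernel weights. The function name `central_difference_conv2d`, the pure-Python single-channel layout, and the default `theta=0.7` are illustrative assumptions, not taken from the listing above.

```python
def central_difference_conv2d(image, kernel, theta=0.7):
    """Single-channel, 'valid'-padding central difference convolution.

    Illustrative sketch only: `theta` (an assumed hyperparameter) blends the
    intensity term (theta = 0, ordinary convolution) with the gradient term
    (theta = 1, convolution over differences to the center pixel).
    Expects an odd-sized kernel so that a center pixel exists.
    """
    kh, kw = len(kernel), len(kernel[0])
    ch, cw = kh // 2, kw // 2  # offset of the kernel center
    kernel_sum = sum(sum(row) for row in kernel)
    out = []
    for i in range(len(image) - kh + 1):
        out_row = []
        for j in range(len(image[0]) - kw + 1):
            # Intensity term: ordinary (cross-correlation style) convolution.
            vanilla = sum(
                kernel[a][b] * image[i + a][j + b]
                for a in range(kh)
                for b in range(kw)
            )
            # Gradient term folded in: subtract theta * center * sum(weights).
            center = image[i + ch][j + cw]
            out_row.append(vanilla - theta * center * kernel_sum)
        out.append(out_row)
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
plain = central_difference_conv2d(image, kernel, theta=0.0)     # intensity only
gradient = central_difference_conv2d(image, kernel, theta=1.0)  # differences only
```

With `theta=1.0` a perfectly flat patch yields zero response, since every neighbor equals the center; with `theta=0.0` the function reduces to an ordinary convolution.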
2 code implementations • ICCV 2019 • Zitong Yu, Wei Peng, Xiaobai Li, Xiaopeng Hong, Guoying Zhao
The method includes two parts: 1) a Spatio-Temporal Video Enhancement Network (STVEN) for video enhancement, and 2) an rPPG network (rPPGNet) for rPPG signal recovery.
Heart rate estimation, Photoplethysmography (PPG) heart rate estimation
2 code implementations • 7 May 2019 • Zitong Yu, Xiaobai Li, Guoying Zhao
Recent studies demonstrated that the average heart rate (HR) can be measured from facial videos based on non-contact remote photoplethysmography (rPPG).
no code implementations • 29 Apr 2019 • Yunxiao Qin, Chenxu Zhao, Xiangyu Zhu, Zezheng Wang, Zitong Yu, Tianyu Fu, Feng Zhou, Jingping Shi, Zhen Lei
Therefore, we define face anti-spoofing as a zero- and few-shot learning problem.
no code implementations • 31 Mar 2019 • Hui Li, Meng Yang, Zhihui Lai, Wei-Shi Zheng, Zitong Yu
Deep part-based methods in recent literature have revealed the great potential of learning local part-level representations of pedestrian images for person re-identification.