no code implementations • 21 Mar 2024 • Ajian Liu, Shuai Xue, Jianwen Gan, Jun Wan, Yanyan Liang, Jiankang Deng, Sergio Escalera, Zhen Lei
Specifically, we propose a novel Class Free Prompt Learning (CFPL) paradigm for DG FAS, which utilizes two lightweight transformers, namely Content Q-Former (CQF) and Style Q-Former (SQF), to learn the different semantic prompts conditioned on content and style features by using a set of learnable query vectors, respectively.
no code implementations • 3 Jan 2024 • Haowen Zheng, Dong Cao, Jintao Xu, Rui Ai, Weihao Gu, Yang Yang, Yanyan Liang
Ultimately, we utilize this reconstruction target to reconstruct the student features.
no code implementations • 5 Dec 2023 • Tianshun Han, Shengnan Gui, Yiqing Huang, Baihui Li, Lijian Liu, Benjia Zhou, Ning Jiang, Quan Lu, Ruicong Zhi, Yanyan Liang, Du Zhang, Jun Wan
The framework entails three modules: PMMTalk encoder, cross-modal alignment module, and PMMTalk decoder.
1 code implementation • ICCV 2023 • Liying Yang, Zhenwei Zhu, Xuxin Lin, Jian Nong, Yanyan Liang
The tokens in each group are sampled from all views and can provide a macro representation of the view in which they reside.
Ranked #1 on 3D Object Reconstruction on Data3D−R2N2
1 code implementation • ICCV 2023 • Benjia Zhou, Zhigang Chen, Albert Clapés, Jun Wan, Yanyan Liang, Sergio Escalera, Zhen Lei, Du Zhang
Many previous methods employ an intermediate representation, i.e., gloss sequences, to facilitate SLT, thus transforming it into a two-stage task of sign language recognition (SLR) followed by sign language translation (SLT).
Ranked #2 on Gloss-free Sign Language Translation on PHOENIX14T
Gloss-free Sign Language Translation • Self-Supervised Learning • +3
no code implementations • 5 May 2023 • Ajian Liu, Zichang Tan, Zitong Yu, Chenxu Zhao, Jun Wan, Yanyan Liang, Zhen Lei, Du Zhang, Stan Z. Li, Guodong Guo
The availability of handy multi-modal (i.e., RGB-D) sensors has brought about a surge of face anti-spoofing research.
no code implementations • 15 Apr 2023 • Ajian Liu, Yanyan Liang
The existing multi-modal face anti-spoofing (FAS) frameworks are designed based on two strategies: halfway and late fusion.
1 code implementation • ICCV 2023 • Zhenwei Zhu, Liying Yang, Ning Li, Chaohao Jiang, Yanyan Liang
Experiments on ShapeNet confirm that our decoupled learning method is adaptable to unstructured multiple images.
Ranked #2 on 3D Object Reconstruction on Data3D−R2N2
1 code implementation • 16 Nov 2022 • Benjia Zhou, Pichao Wang, Jun Wan, Yanyan Liang, Fan Wang
Although these methods improve motion recognition to some extent, they remain sub-optimal in the following aspects: (i) data augmentation, i.e., the scale of the RGB-D datasets is still limited, and few efforts have been made to explore novel data augmentation strategies for videos; (ii) optimization mechanism, i.e., the tightly space-time-entangled network structure makes spatiotemporal information modeling more challenging; and (iii) cross-modal knowledge fusion, i.e., the high similarity between multimodal representations leads to insufficient late fusion.
Ranked #3 on Action Recognition on NTU RGB+D
1 code implementation • 4 Nov 2022 • Zhenwei Zhu, Liying Yang, Xuxin Lin, Chaohao Jiang, Ning Li, Lin Yang, Yanyan Liang
Deep learning technology has made great progress in multi-view 3D reconstruction tasks.
no code implementations • 29 Sep 2022 • Benjia Zhou, Pichao Wang, Jun Wan, Yanyan Liang, Fan Wang
To achieve these two purposes, we propose a novel data-centric ViT training framework to dynamically measure the "difficulty" of training samples and generate "effective" samples for models at different training stages.
1 code implementation • CVPR 2022 • Benjia Zhou, Pichao Wang, Jun Wan, Yanyan Liang, Fan Wang, Du Zhang, Zhen Lei, Hao Li, Rong Jin
Decoupling spatiotemporal representation refers to decomposing the spatial and temporal features into dimension-independent factors.
Ranked #1 on Hand Gesture Recognition on NVGesture
no code implementations • 13 Apr 2021 • Ajian Liu, Chenxu Zhao, Zitong Yu, Jun Wan, Anyang Su, Xing Liu, Zichang Tan, Sergio Escalera, Junliang Xing, Yanyan Liang, Guodong Guo, Zhen Lei, Stan Z. Li, Du Zhang
To bridge the gap to real-world applications, we introduce a large-scale High-Fidelity Mask dataset, namely CASIA-SURF HiFiMask (briefly HiFiMask).
no code implementations • ICLR 2020 • Yanyan Liang, Yanfeng Zhang, Dechao Gao, Qian Xu
This motivates us to use a multiplex structure in a diverse way and utilize a priori properties of graphs to guide the learning.
no code implementations • 28 Aug 2019 • Shifeng Zhang, Ajian Liu, Jun Wan, Yanyan Liang, Guodong Guo, Sergio Escalera, Hugo Jair Escalante, Stan Z. Li
To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and modalities.