Search Results for author: Yanyan Liang

Found 15 papers, 6 papers with code

CFPL-FAS: Class Free Prompt Learning for Generalizable Face Anti-spoofing

no code implementations • 21 Mar 2024 • Ajian Liu, Shuai Xue, Jianwen Gan, Jun Wan, Yanyan Liang, Jiankang Deng, Sergio Escalera, Zhen Lei

Specifically, we propose a novel Class Free Prompt Learning (CFPL) paradigm for DG FAS, which utilizes two lightweight transformers, namely Content Q-Former (CQF) and Style Q-Former (SQF), to learn the different semantic prompts conditioned on content and style features by using a set of learnable query vectors, respectively.

Domain Generalization, Face Anti-Spoofing

Distilling Temporal Knowledge with Masked Feature Reconstruction for 3D Object Detection

no code implementations • 3 Jan 2024 • Haowen Zheng, Dong Cao, Jintao Xu, Rui Ai, Weihao Gu, Yang Yang, Yanyan Liang

Ultimately, we utilize this reconstruction target to reconstruct the student features.

3D Object Detection, Knowledge Distillation +1

PMMTalk: Speech-Driven 3D Facial Animation from Complementary Pseudo Multi-modal Features

no code implementations • 5 Dec 2023 • Tianshun Han, Shengnan Gui, Yiqing Huang, Baihui Li, Lijian Liu, Benjia Zhou, Ning Jiang, Quan Lu, Ruicong Zhi, Yanyan Liang, Du Zhang, Jun Wan

The framework entails three modules: PMMTalk encoder, cross-modal alignment module, and PMMTalk decoder.

Speech Recognition +1

Long-Range Grouping Transformer for Multi-View 3D Reconstruction

1 code implementation • ICCV 2023 • Liying Yang, Zhenwei Zhu, Xuxin Lin, Jian Nong, Yanyan Liang

The tokens in each group are sampled from all views and can provide macro representation for the resided view.

Ranked #1 on 3D Object Reconstruction on Data3D-R2N2

3D Object Reconstruction, 3D Reconstruction +1

Gloss-free Sign Language Translation: Improving from Visual-Language Pretraining

1 code implementation • ICCV 2023 • Benjia Zhou, Zhigang Chen, Albert Clapés, Jun Wan, Yanyan Liang, Sergio Escalera, Zhen Lei, Du Zhang

Many previous methods employ an intermediate representation, i.e., gloss sequences, to facilitate SLT, thus transforming it into a two-stage task of sign language recognition (SLR) followed by sign language translation (SLT).

Ranked #2 on Gloss-free Sign Language Translation on PHOENIX14T

Gloss-free Sign Language Translation, Self-Supervised Learning +3

FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing

no code implementations • 5 May 2023 • Ajian Liu, Zichang Tan, Zitong Yu, Chenxu Zhao, Jun Wan, Yanyan Liang, Zhen Lei, Du Zhang, Stan Z. Li, Guodong Guo

The availability of handy multi-modal (i.e., RGB-D) sensors has brought about a surge of face anti-spoofing research.

Face Anti-Spoofing, Face Presentation Attack Detection

MA-ViT: Modality-Agnostic Vision Transformers for Face Anti-Spoofing

no code implementations • 15 Apr 2023 • Ajian Liu, Yanyan Liang

The existing multi-modal face anti-spoofing (FAS) frameworks are designed based on two strategies: halfway and late fusion.

Face Anti-Spoofing

UMIFormer: Mining the Correlations between Similar Tokens for Multi-View 3D Reconstruction

1 code implementation • ICCV 2023 • Zhenwei Zhu, Liying Yang, Ning Li, Chaohao Jiang, Yanyan Liang

We empirically demonstrate on ShapeNet and confirm that our decoupled learning method is adaptable for unstructured multiple images.

Ranked #2 on 3D Object Reconstruction on Data3D-R2N2

3D Object Reconstruction, 3D Reconstruction +1

A Unified Multimodal De- and Re-coupling Framework for RGB-D Motion Recognition

1 code implementation • 16 Nov 2022 • Benjia Zhou, Pichao Wang, Jun Wan, Yanyan Liang, Fan Wang

Although improving motion recognition to some extent, these methods still face sub-optimal situations in the following aspects: (i) data augmentation, i.e., the scale of the RGB-D datasets is still limited, and few efforts have been made to explore novel data augmentation strategies for videos; (ii) optimization mechanism, i.e., the tightly space-time-entangled network structure brings more challenges to spatiotemporal information modeling; and (iii) cross-modal knowledge fusion, i.e., the high similarity between multimodal representations leads to insufficient late fusion.

Ranked #3 on Action Recognition on NTU RGB+D

Action Recognition, Data Augmentation +2

GARNet: Global-Aware Multi-View 3D Reconstruction Network and the Cost-Performance Tradeoff

1 code implementation • 4 Nov 2022 • Zhenwei Zhu, Liying Yang, Xuxin Lin, Chaohao Jiang, Ning Li, Lin Yang, Yanyan Liang

Deep learning technology has made great progress in multi-view 3D reconstruction tasks.

3D Reconstruction, Multi-View 3D Reconstruction

Effective Vision Transformer Training: A Data-Centric Perspective

no code implementations • 29 Sep 2022 • Benjia Zhou, Pichao Wang, Jun Wan, Yanyan Liang, Fan Wang

To achieve these two purposes, we propose a novel data-centric ViT training framework to dynamically measure the "difficulty" of training samples and generate "effective" samples for models at different training stages.

Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition

1 code implementation • CVPR 2022 • Benjia Zhou, Pichao Wang, Jun Wan, Yanyan Liang, Fan Wang, Du Zhang, Zhen Lei, Hao Li, Rong Jin

Decoupling spatiotemporal representation refers to decomposing the spatial and temporal features into dimension-independent factors.

Ranked #1 on Hand Gesture Recognition on NVGesture

Hand Gesture Recognition

Contrastive Context-Aware Learning for 3D High-Fidelity Mask Face Presentation Attack Detection

no code implementations • 13 Apr 2021 • Ajian Liu, Chenxu Zhao, Zitong Yu, Jun Wan, Anyang Su, Xing Liu, Zichang Tan, Sergio Escalera, Junliang Xing, Yanyan Liang, Guodong Guo, Zhen Lei, Stan Z. Li, Du Zhang

To bridge the gap to real-world applications, we introduce a large-scale High-Fidelity Mask dataset, namely CASIA-SURF HiFiMask (briefly HiFiMask).

Face Presentation Attack Detection, Face Recognition

MxPool: Multiplex Pooling for Hierarchical Graph Representation Learning

no code implementations • ICLR 2020 • Yanyan Liang, Yanfeng Zhang, Dechao Gao, Qian Xu

This motivates us to use a multiplex structure in a diverse way and utilize a priori properties of graphs to guide the learning.

Clustering, General Classification +3

CASIA-SURF: A Large-scale Multi-modal Benchmark for Face Anti-spoofing

no code implementations • 28 Aug 2019 • Shifeng Zhang, Ajian Liu, Jun Wan, Yanyan Liang, Guogong Guo, Sergio Escalera, Hugo Jair Escalante, Stan Z. Li

To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and modalities.

Face Anti-Spoofing, Face Recognition
