2 code implementations • 14 Dec 2023 • Dongchen Han, Tianzhu Ye, Yizeng Han, Zhuofan Xia, Shiji Song, Gao Huang
Specifically, Agent Attention, denoted as a quadruple $(Q, A, K, V)$, introduces an additional set of agent tokens $A$ into the conventional attention module.
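As a rough illustration of the quadruple above, the sketch below performs attention in two stages: the agent tokens $A$ first aggregate information from $(K, V)$, and the queries $Q$ then attend to the agents. Obtaining the agents by pooling the queries, and the tensor shapes, are illustrative assumptions rather than the paper's exact design.

```python
import torch
import torch.nn.functional as F

def agent_attention(q, k, v, a, scale=None):
    """Two-stage attention over the quadruple (Q, A, K, V).

    q: (B, Nq, d)  queries
    k: (B, Nk, d)  keys
    v: (B, Nk, d)  values
    a: (B, Na, d)  agent tokens, with Na << Nk
    """
    scale = scale or q.shape[-1] ** -0.5
    # Stage 1: agent aggregation -- agents act as queries over (K, V).
    agent_v = F.softmax(a @ k.transpose(-2, -1) * scale, dim=-1) @ v      # (B, Na, d)
    # Stage 2: agent broadcast -- the original queries attend to the agents.
    out = F.softmax(q @ a.transpose(-2, -1) * scale, dim=-1) @ agent_v    # (B, Nq, d)
    return out

# Usage: agents obtained here by average-pooling the queries (an assumption).
B, N, d, Na = 2, 196, 64, 49
q, k, v = (torch.randn(B, N, d) for _ in range(3))
a = F.adaptive_avg_pool1d(q.transpose(1, 2), Na).transpose(1, 2)          # (B, Na, d)
print(agent_attention(q, k, v, a).shape)  # torch.Size([2, 196, 64])
```

Because both stages attend over only $N_a$ agent tokens, the cost scales with $N \cdot N_a$ rather than $N^2$, which is the point of introducing the agents.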
no code implementations • 6 Dec 2023 • Linze Li, Sunqi Fan, Hengjun Pu, Zhaodong Bing, Yao Tang, Tianzhu Ye, Tong Yang, Liangyu Chen, Jiajun Liang
Our method's efficacy has been validated on multiple representative DreamBooth and LoRA models, delivering substantial improvements over the original outcomes in terms of facial fidelity, text-to-image editability, and video motion.
1 code implementation • CVPR 2023 • Siyuan Wei, Tianzhu Ye, Shen Zhang, Yao Tang, Jiajun Liang
Experiments on various transformers demonstrate the effectiveness of our method, while analysis experiments show its greater robustness to errors in the token pruning policy.
Ranked #1 on Efficient ViTs on ImageNet-1K (with DeiT-S)
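The paper's own pruning-and-squeezing design is not reproduced here; the sketch below only illustrates what a generic token pruning policy of this kind can look like: tokens receiving the most [CLS] attention are kept, and each pruned token is folded into its most similar kept token rather than discarded. The keep ratio, the cosine-similarity matching, and the averaging step are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def prune_and_squeeze(tokens, cls_attn, keep_ratio=0.5):
    """Illustrative token pruning: keep the tokens that receive the most
    [CLS] attention, then merge each pruned token into its nearest kept
    token so its information is not thrown away outright.

    tokens:   (B, N, d) patch tokens (excluding [CLS])
    cls_attn: (B, N)    attention from [CLS] to each patch token
    """
    B, N, d = tokens.shape
    n_keep = max(1, int(N * keep_ratio))
    order = cls_attn.argsort(dim=1, descending=True)
    keep_idx, drop_idx = order[:, :n_keep], order[:, n_keep:]

    kept = torch.gather(tokens, 1, keep_idx.unsqueeze(-1).expand(-1, -1, d))
    dropped = torch.gather(tokens, 1, drop_idx.unsqueeze(-1).expand(-1, -1, d))

    # Assign each dropped token to its most similar kept token and average it
    # in -- a crude stand-in for a learned squeezing step.
    sim = F.normalize(dropped, dim=-1) @ F.normalize(kept, dim=-1).transpose(1, 2)
    match = sim.argmax(dim=-1)                                    # (B, N - n_keep)
    summed = kept.clone()
    summed.scatter_add_(1, match.unsqueeze(-1).expand(-1, -1, d), dropped)
    counts = torch.ones(B, n_keep, device=tokens.device)
    counts.scatter_add_(1, match, torch.ones_like(match, dtype=counts.dtype))
    return summed / counts.unsqueeze(-1)                          # (B, n_keep, d)
```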
1 code implementation • CVPR 2023 • Xuran Pan, Tianzhu Ye, Zhuofan Xia, Shiji Song, Gao Huang
The self-attention mechanism, which enables adaptive feature extraction from global contexts, has been a key factor in the recent progress of Vision Transformers (ViT).
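For context, the global self-attention referred to here is usually the standard scaled dot-product form, in which every token attends to every other token. A minimal single-head sketch (the dimensions and the single-head simplification are assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Single-head scaled dot-product self-attention: each token aggregates
    features from all tokens, i.e. from the global context."""

    def __init__(self, dim):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x):                       # x: (B, N, dim)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)  # (B, N, N)
        return self.proj(attn @ v)              # (B, N, dim)

# Example: 196 patch tokens of a ViT-style model with 64-dim embeddings.
x = torch.randn(2, 196, 64)
print(SelfAttention(64)(x).shape)  # torch.Size([2, 196, 64])
```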
no code implementations • 17 Oct 2022 • Xuran Pan, Tianzhu Ye, Dongchen Han, Shiji Song, Gao Huang
Recent years have witnessed the rapid development of large-scale pre-training frameworks that can extract multi-modal representations in a unified form and achieve promising performance when transferred to downstream tasks.