Search Results for author: Dongchen Han

Found 8 papers, 4 papers with code

VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models

no code implementations • 21 Feb 2024 • Jiawei Liang, Siyuan Liang, Man Luo, Aishan Liu, Dongchen Han, Ee-Chien Chang, Xiaochun Cao

Nevertheless, the frozen visual encoder in autoregressive VLMs imposes constraints on the learning of conventional image triggers.

Backdoor Attack Few-Shot Learning +1

Paper
Add Code

GSVA: Generalized Segmentation via Multimodal Large Language Models

no code implementations • 15 Dec 2023 • Zhuofan Xia, Dongchen Han, Yizeng Han, Xuran Pan, Shiji Song, Gao Huang

Generalized Referring Expression Segmentation (GRES) extends the scope of classic RES to refer to multiple objects in one expression or identify the empty targets absent in the image.

Generalized Referring Expression Segmentation Referring Expression +1

Paper
Add Code

Agent Attention: On the Integration of Softmax and Linear Attention

2 code implementations • 14 Dec 2023 • Dongchen Han, Tianzhu Ye, Yizeng Han, Zhuofan Xia, Shiji Song, Gao Huang

Specifically, the Agent Attention, denoted as a quadruple $(Q, A, K, V)$, introduces an additional set of agent tokens $A$ into the conventional attention module.

Computational Efficiency Image Classification +4

342

Paper
Code

OT-Attack: Enhancing Adversarial Transferability of Vision-Language Models via Optimal Transport Optimization

no code implementations • 7 Dec 2023 • Dongchen Han, Xiaojun Jia, Yang Bai, Jindong Gu, Yang Liu, Xiaochun Cao

Investigating the generation of high-transferability adversarial examples is crucial for uncovering VLP models' vulnerabilities in practical scenarios.

Adversarial Attack Data Augmentation +2

Paper
Add Code

FLatten Transformer: Vision Transformer using Focused Linear Attention

1 code implementation • ICCV 2023 • Dongchen Han, Xuran Pan, Yizeng Han, Shiji Song, Gao Huang

The quadratic computation complexity of self-attention has been a persistent challenge when applying Transformer models to vision tasks.

337

Paper
Code

Dynamic Perceiver for Efficient Visual Recognition

1 code implementation • ICCV 2023 • Yizeng Han, Dongchen Han, Zeyu Liu, Yulin Wang, Xuran Pan, Yifan Pu, Chao Deng, Junlan Feng, Shiji Song, Gao Huang

Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.

Action Recognition Classification +4

Paper
Code

Contrastive Language-Image Pre-Training with Knowledge Graphs

no code implementations • 17 Oct 2022 • Xuran Pan, Tianzhu Ye, Dongchen Han, Shiji Song, Gao Huang

Recent years have witnessed the fast development of large-scale pre-training frameworks that can extract multi-modal representations in a unified form and achieve promising performances when transferred to downstream tasks.

Knowledge Graphs

Paper
Add Code

Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding

1 code implementation • CVPR 2022 • Haojun Jiang, Yuanze Lin, Dongchen Han, Shiji Song, Gao Huang

Our method leverages an off-the-shelf object detector to identify visual objects from unlabeled images, and then language queries for these objects are obtained in an unsupervised fashion with a pseudo-query generation module.

Language Modelling Natural Language Queries +1

137

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.