Search Results for author: Jinguo Zhu

Found 7 papers, 6 papers with code

SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation

1 code implementation · 22 Apr 2024 · Yuying Ge, Sijie Zhao, Jinguo Zhu, Yixiao Ge, Kun Yi, Lin Song, Chen Li, Xiaohan Ding, Ying Shan

We hope that our work will inspire future research into what can be achieved by versatile multimodal foundation models in real-world applications.

Image Generation

VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation

1 code implementation · 14 Dec 2023 · Jinguo Zhu, Xiaohan Ding, Yixiao Ge, Yuying Ge, Sijie Zhao, Hengshuang Zhao, Xiaohua Wang, Ying Shan

In combination with the existing text tokenizer and detokenizer, this framework allows for the encoding of interleaved image-text data into a multimodal sequence, which can subsequently be fed into the transformer model.

Image Captioning · In-Context Learning · +4
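The VL-GPT excerpt above describes encoding interleaved image-text data into one multimodal sequence for the transformer. Below is a minimal sketch of that idea, not the authors' code: `text_tokenizer`, `image_tokenizer`, and `embed` are hypothetical stand-ins for the paper's text tokenizer, visual tokenizer, and token-embedding layer.

```python
# Sketch only: interleave text token embeddings and image embeddings into a
# single sequence, in document order, for a decoder-only transformer.
import torch

def encode_interleaved(segments, text_tokenizer, image_tokenizer, embed):
    """segments: list of ("text", str) or ("image", tensor) pairs in
    document order. Returns one [seq_len, dim] embedding sequence."""
    parts = []
    for kind, content in segments:
        if kind == "text":
            ids = torch.tensor(text_tokenizer.encode(content))
            parts.append(embed(ids))                # text token embeddings
        else:
            parts.append(image_tokenizer(content))  # visual embeddings
    return torch.cat(parts, dim=0)  # fed to the transformer as one sequence
```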

VLATTACK: Multimodal Adversarial Attacks on Vision-Language Tasks via Pre-trained Models

1 code implementation · NeurIPS 2023 · Ziyi Yin, Muchao Ye, Tianrong Zhang, Tianyu Du, Jinguo Zhu, Han Liu, Jinghui Chen, Ting Wang, Fenglong Ma

In this paper, we investigate a new yet practical task: crafting image and text perturbations with pre-trained VL models to attack black-box fine-tuned models on different downstream tasks.

Adversarial Robustness

Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks

2 code implementations · CVPR 2023 · Hao Li, Jinguo Zhu, Xiaohu Jiang, Xizhou Zhu, Hongsheng Li, Chun Yuan, Xiaohua Wang, Yu Qiao, Xiaogang Wang, Wenhai Wang, Jifeng Dai

In this paper, we propose Uni-Perceiver v2, which is the first generalist model capable of handling major large-scale vision and vision-language tasks with competitive performance.

Language Modelling · Multi-Task Learning

Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks

1 code implementation · CVPR 2022 · Xizhou Zhu, Jinguo Zhu, Hao Li, Xiaoshi Wu, Xiaogang Wang, Hongsheng Li, Xiaohua Wang, Jifeng Dai

The model is pre-trained on several uni-modal and multi-modal tasks, and evaluated on a variety of downstream tasks, including novel tasks that did not appear in the pre-training stage.

Layerwise Optimization by Gradient Decomposition for Continual Learning

no code implementations · CVPR 2021 · Shixiang Tang, Dapeng Chen, Jinguo Zhu, Shijie Yu, Wanli Ouyang

The gradient for update should be close to the gradient of the new task, consistent with the gradients shared by all old tasks, and orthogonal to the space spanned by the gradients specific to the old tasks.

Continual Learning
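The constraint quoted above is geometric, so a rough sketch may help. This is a minimal illustration under stated assumptions, not the paper's algorithm: `g` is the new-task gradient, `g_shared` a direction shared by the old tasks, and `S` a matrix whose columns span the old tasks' task-specific gradient directions (all hypothetical names); the conflict-resolution step mirrors an A-GEM-style projection rather than the paper's exact decomposition.

```python
# Sketch only: keep the update close to the new-task gradient, orthogonal to
# the task-specific old-gradient subspace, and consistent with the shared
# old-task direction.
import numpy as np

def decomposed_update(g, g_shared, S):
    # Project out the component lying in span(S):
    # g_perp = g - S (S^+ ) g, where S S^+ is the projector onto span(S).
    P = S @ np.linalg.pinv(S)          # (d, d) projector onto span(S)
    g_perp = g - P @ g
    # If the result conflicts with the shared direction, remove the
    # conflicting component so their inner product becomes zero.
    if g_perp @ g_shared < 0:
        g_perp = g_perp - (g_perp @ g_shared) / (g_shared @ g_shared) * g_shared
    return g_perp
```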

Complementary Relation Contrastive Distillation

1 code implementation · CVPR 2021 · Jinguo Zhu, Shixiang Tang, Dapeng Chen, Shijie Yu, Yakun Liu, Aijun Yang, Mingzhe Rong, Xiaohua Wang

Specifically, we estimate the mutual relation in an anchor-based way and distill the anchor-student relation under the supervision of its corresponding anchor-teacher relation.

Knowledge Distillation · Relation
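As a rough illustration of the anchor-based relation idea above (a sketch under assumptions, not the paper's contrastive estimator): relations are taken here as temperature-scaled cosine similarities to a set of anchor features, and the student's relation distribution is matched to the teacher's with a KL loss. `f_student`, `f_teacher`, and `anchors` are hypothetical feature batches.

```python
# Sketch only: distill anchor-student relations under the supervision of the
# corresponding anchor-teacher relations.
import torch
import torch.nn.functional as F

def relation_distill_loss(f_student, f_teacher, anchors, tau=0.1):
    # Relations: cosine similarity of every sample to every anchor,
    # sharpened by a temperature tau. Shapes: [N, d] x [M, d] -> [N, M].
    rel_s = F.normalize(f_student, dim=1) @ F.normalize(anchors, dim=1).T / tau
    rel_t = F.normalize(f_teacher, dim=1) @ F.normalize(anchors, dim=1).T / tau
    # Match the student's relation distribution to the teacher's.
    return F.kl_div(F.log_softmax(rel_s, dim=1),
                    F.softmax(rel_t, dim=1), reduction="batchmean")
```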
