no code implementations • 9 May 2024 • Yang Bai, Ge Pei, Jindong Gu, Yong Yang, Xingjun Ma
In this paper, we take a step further and show that certain special characters or their combinations with English letters are stronger memory triggers, leading to more severe data leakage.
no code implementations • 25 Apr 2024 • Kuofeng Gao, Jindong Gu, Yang Bai, Shu-Tao Xia, Philip Torr, Wei Liu, Zhifeng Li
For verbose videos, a frame feature diversity loss is proposed to increase the feature diversity among frames.
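The snippet names the loss but not its form. A minimal sketch of one plausible frame feature diversity loss, written as the mean pairwise cosine similarity among frame features (the attacker minimizes it to push the features apart; function and tensor names are illustrative):

```python
import torch
import torch.nn.functional as F

def frame_diversity_loss(frame_feats: torch.Tensor) -> torch.Tensor:
    """Mean pairwise cosine similarity among frame features.

    frame_feats: (num_frames, dim). Minimizing the returned value
    drives the frame features apart, i.e. increases their diversity.
    """
    f = F.normalize(frame_feats, dim=-1)   # unit-norm features
    sim = f @ f.t()                        # (T, T) cosine similarities
    mask = ~torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    return sim[mask].mean()                # drop self-similarity terms
```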
2 code implementations • 11 Apr 2024 • Runtao Liu, Ashkan Khakzar, Jindong Gu, Qifeng Chen, Philip Torr, Fabio Pizzati
Hence, we propose Latent Guard, a framework designed to improve safety measures in text-to-image generation.
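The snippet introduces the framework without its mechanics. Purely to illustrate the general flavor of such safety checks, one can embed an input prompt and flag it when it lies too close to embeddings of blocked concepts; the encoder, concept list, and threshold below are assumptions, not Latent Guard's actual design:

```python
import torch
import torch.nn.functional as F

def is_unsafe(prompt_emb: torch.Tensor,
              blocked_concept_embs: torch.Tensor,
              threshold: float = 0.8) -> bool:
    """Flag a prompt whose embedding is close to any blocked concept.

    prompt_emb: (dim,); blocked_concept_embs: (num_concepts, dim).
    """
    sims = F.cosine_similarity(prompt_emb.unsqueeze(0),
                               blocked_concept_embs, dim=-1)
    return bool((sims > threshold).any())
```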
no code implementations • 8 Apr 2024 • Jindong Gu
To answer this question, this paper investigates the practical requirements for responsible textual and visual generative models, outlining five key considerations: generating truthful content, avoiding toxic content, refusing harmful instructions, leaking no training-data-related content, and ensuring that generated content is identifiable.
no code implementations • 4 Apr 2024 • Shuo Chen, Zhen Han, Bailan He, Zifeng Ding, Wenqian Yu, Philip Torr, Volker Tresp, Jindong Gu
Various jailbreak attacks have been proposed to red-team Large Language Models (LLMs), revealing vulnerabilities in their safeguards.
no code implementations • 3 Apr 2024 • Fengyuan Liu, Haochen Luo, Yiming Li, Philip Torr, Jindong Gu
In this work, we study the origin attribution of generated images in a practical setting where only a few images generated by a source model are available and the source model cannot be accessed.
no code implementations • 19 Mar 2024 • Anjun Hu, Jindong Gu, Francesco Pinto, Konstantinos Kamnitsas, Philip Torr
Foundation models pre-trained on web-scale vision-language data, such as CLIP, are widely used as cornerstones of powerful machine learning systems.
1 code implementation • 14 Mar 2024 • Haochen Luo, Jindong Gu, Fengyuan Liu, Philip Torr
Given that VLMs rely on prompts to adapt to different tasks, an intriguing question emerges: Can a single adversarial image mislead all predictions of VLMs when a thousand different prompts are given?
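A minimal sketch of how such a cross-prompt attack could be set up, assuming a hypothetical `model(image, prompt)` call that returns logits for the target tokens; the loss, step sizes, and API are placeholders rather than the paper's method:

```python
import torch
import torch.nn.functional as F

def cross_prompt_attack(model, image, prompts, targets,
                        steps=100, eps=8 / 255, alpha=1 / 255):
    """PGD-style search for ONE perturbation that degrades the VLM's
    predictions under MANY prompts (loss averaged over prompts)."""
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        loss = sum(F.cross_entropy(model(image + delta, p), t)
                   for p, t in zip(prompts, targets)) / len(prompts)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()   # ascend: maximize error
            delta.clamp_(-eps, eps)              # stay in the L-inf ball
            delta.grad.zero_()
    return (image + delta).clamp(0, 1).detach()
```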
1 code implementation • 8 Mar 2024 • Tianrui Lou, Xiaojun Jia, Jindong Gu, Li Liu, Siyuan Liang, Bangyan He, Xiaochun Cao
We find that concealing deformation perturbations in regions insensitive to the human eye, specifically complex parts of the object surface with drastic curvature changes, achieves a better trade-off between imperceptibility and adversarial strength.
no code implementations • 29 Feb 2024 • Hao Cheng, Erjia Xiao, Jindong Gu, Le Yang, Jinhao Duan, Jize Zhang, Jiahang Cao, Kaidi Xu, Renjing Xu
Large Vision-Language Models (LVLMs) rely on vision encoders and Large Language Models (LLMs) to exhibit remarkable capabilities on various multi-modal tasks in the joint space of vision and language.
no code implementations • 22 Feb 2024 • Zefeng Wang, Zhen Han, Shuo Chen, Fan Xue, Zifeng Ding, Xun Xiao, Volker Tresp, Philip Torr, Jindong Gu
Our research evaluates the adversarial robustness of MLLMs when employing CoT reasoning, finding that CoT marginally improves adversarial robustness against existing attack methods.
1 code implementation • 20 Jan 2024 • Kuofeng Gao, Yang Bai, Jindong Gu, Shu-Tao Xia, Philip Torr, Zhifeng Li, Wei Liu
Once attackers maliciously induce high energy consumption and latency (the energy-latency cost) during VLM inference, they can exhaust the system's computational resources.
no code implementations • 31 Dec 2023 • Xinwei Liu, Xiaojun Jia, Jindong Gu, Yuan Xun, Siyuan Liang, Xiaochun Cao
However, in this paper, we propose the Few-shot Learning Backdoor Attack (FLBA) to show that FSL can still be vulnerable to backdoor attacks.
1 code implementation • 29 Dec 2023 • Xingqiao Li, Jindong Gu, Zhiyong Wang, Yancheng Yuan, Bo Du, Fengxiang He
To address this issue, this paper proposes an eXplainable Multimodal Mortality Predictor (X-MMP), an efficient, explainable AI solution for predicting in-hospital mortality from multimodal ICU data.
1 code implementation • 10 Dec 2023 • Andong Hua, Jindong Gu, Zhiyu Xue, Nicholas Carlini, Eric Wong, Yao Qin
Based on this, we propose Robust Linear Initialization (RoLI) for adversarial finetuning, which initializes the linear head with the weights obtained by adversarial linear probing to maximally inherit the robustness from pretraining.
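A compact sketch of the two-stage recipe the abstract describes, with a placeholder `attack(model, x, y)` that returns adversarial examples; the optimizers and learning rates are illustrative:

```python
import torch
import torch.nn.functional as F

def roli_finetune(backbone, feat_dim, num_classes, loader, attack):
    head = torch.nn.Linear(feat_dim, num_classes)
    model = torch.nn.Sequential(backbone, head)

    # Stage 1: adversarial linear probing -- backbone frozen,
    # only the linear head is trained on adversarial examples.
    for p in backbone.parameters():
        p.requires_grad = False
    opt = torch.optim.SGD(head.parameters(), lr=0.1)
    for x, y in loader:
        loss = F.cross_entropy(model(attack(model, x, y)), y)
        opt.zero_grad(); loss.backward(); opt.step()

    # Stage 2: full adversarial fine-tuning -- the head starts from
    # the robust weights found above (the RoLI initialization).
    for p in backbone.parameters():
        p.requires_grad = True
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    for x, y in loader:
        loss = F.cross_entropy(model(attack(model, x, y)), y)
        opt.zero_grad(); loss.backward(); opt.step()
    return model
```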
no code implementations • 7 Dec 2023 • Dongchen Han, Xiaojun Jia, Yang Bai, Jindong Gu, Yang Liu, Xiaochun Cao
Investigating the generation of highly transferable adversarial examples is crucial for uncovering the vulnerabilities of VLP models in practical scenarios.
no code implementations • 3 Dec 2023 • Xiaojun Jia, Jindong Gu, Yihao Huang, Simeng Qin, Qing Guo, Yang Liu, Xiaochun Cao
In the second stage, the pixels are divided into different branches according to their transferability, measured via Kullback-Leibler divergence.
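The abstract does not specify the exact criterion; on a loose reading, one could split pixels by the per-pixel KL divergence between the predictions of two surrogate models, as in this hypothetical sketch (the threshold and inputs are assumptions):

```python
import torch

def split_pixels_by_kl(p_a: torch.Tensor, p_b: torch.Tensor,
                       threshold: float = 0.1) -> torch.Tensor:
    """Split pixels into two branches by prediction divergence.

    p_a, p_b: (C, H, W) softmax outputs of two models on one image.
    Returns a boolean (H, W) mask: True = high-divergence branch.
    """
    log_a = p_a.clamp_min(1e-8).log()
    log_b = p_b.clamp_min(1e-8).log()
    kl = (p_a * (log_a - log_b)).sum(dim=0)   # per-pixel KL(p_a || p_b)
    return kl > threshold
```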
no code implementations • 30 Nov 2023 • Avery Ma, Amir-Massoud Farahmand, Yangchen Pan, Philip Torr, Jindong Gu
During the alignment process, the parameters of the source model are fine-tuned to minimize an alignment loss.
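The snippet does not fix the form of the alignment loss; a minimal sketch under the assumption that it penalizes the output discrepancy between the source model and a witness model (MSE is an illustrative choice):

```python
import torch
import torch.nn.functional as F

def alignment_loss(source_logits: torch.Tensor,
                   witness_logits: torch.Tensor) -> torch.Tensor:
    """Penalize the gap between source and witness model outputs."""
    return F.mse_loss(source_logits, witness_logits)

# Fine-tuning step: pull the source model toward the witness on clean
# inputs, e.g. loss = alignment_loss(source(x), witness(x).detach()).
```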
1 code implementation • 29 Nov 2023 • Xin Liu, Yichen Zhu, Jindong Gu, Yunshi Lan, Chao Yang, Yu Qiao
The security concerns surrounding Large Language Models (LLMs) have been extensively explored, yet the safety of Multimodal Large Language Models (MLLMs) remains understudied.
no code implementations • 29 Nov 2023 • Shuo Chen, Zhen Han, Bailan He, Mark Buckley, Philip Torr, Volker Tresp, Jindong Gu
Our findings indicate that ICL in VLMs is predominantly driven by the textual information in the demonstrations whereas the visual information in the demonstrations barely affects the ICL performance.
no code implementations • 28 Nov 2023 • Hang Li, Chengzhi Shen, Philip Torr, Volker Tresp, Jindong Gu
A risk with these models is the potential generation of inappropriate content, such as biased or harmful images.
no code implementations • 24 Nov 2023 • Shitong Sun, Jindong Gu, Shaogang Gong
In this paper, we perform the first robustness study and establish three new, diversified benchmarks for the systematic analysis of text-image composed retrieval under natural corruptions in both the visual and textual modalities, and further probe textual understanding.
no code implementations • 21 Nov 2023 • Gengyuan Zhang, Jinhe Bi, Jindong Gu, Yanyu Chen, Volker Tresp
This raises a question: with such weak supervision, can the video representations in video-language models learn to distinguish even factual discrepancies in the textual description and understand fine-grained events?
1 code implementation • 26 Oct 2023 • Jindong Gu, Xiaojun Jia, Pau de Jorge, Wenqian Yu, Xinwei Liu, Avery Ma, Yuan Xun, Anjun Hu, Ashkan Khakzar, Zhijiang Li, Xiaochun Cao, Philip Torr
This survey explores the landscape of the adversarial transferability of adversarial examples.
1 code implementation • 24 Oct 2023 • Xiaojun Jia, Jianshu Li, Jindong Gu, Yang Bai, Xiaochun Cao
Besides, we provide a theoretical analysis showing that model robustness can be improved by single-step adversarial training with sampled subnetworks.
1 code implementation • 15 Sep 2023 • Zhihao Hu, Yiran Xu, Mengnan Du, Jindong Gu, Xinmei Tian, Fengxiang He
Our adaptive reweighing method prioritizes samples closer to the decision boundary and assigns a higher weight to improve the generalizability of fair classifiers.
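A sketch of one way to realize such boundary-focused reweighing: weight each sample by a decreasing function of its classification margin, so samples near the decision boundary count more (the exponential form and temperature are illustrative choices, not the paper's):

```python
import torch

def boundary_weights(logits: torch.Tensor, labels: torch.Tensor,
                     temperature: float = 1.0) -> torch.Tensor:
    """Up-weight samples whose margin to the boundary is small.

    logits: (N, C); labels: (N,). Margin = true-class logit minus the
    best other-class logit; a small margin means near the boundary.
    """
    true_logit = logits.gather(1, labels.unsqueeze(1)).squeeze(1)
    others = logits.clone()
    others.scatter_(1, labels.unsqueeze(1), float("-inf"))
    margin = true_logit - others.max(dim=1).values
    w = torch.exp(-margin.clamp_min(0) / temperature)
    return w / w.sum()                    # normalized sample weights
```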
no code implementations • 12 Sep 2023 • Jindong Gu, Fangyun Wei, Philip Torr, Han Hu
In this work, we first taxonomize the stochastic defense strategies against QBBA.
no code implementations • 22 Aug 2023 • Xiaojun Jia, Yuefeng Chen, Xiaofeng Mao, Ranjie Duan, Jindong Gu, Rong Zhang, Hui Xue, Xiaochun Cao
In this paper, we conduct a comprehensive study of over 10 fast adversarial training methods in terms of adversarial robustness and training costs.
1 code implementation • ICCV 2023 • Gengyuan Zhang, Jisen Ren, Jindong Gu, Volker Tresp
In this study, we introduce the Multi-event Video-Text Retrieval (MeVTR) task, which addresses scenarios in which each video contains multiple different events, a niche setting within the conventional Video-Text Retrieval task.
no code implementations • 21 Aug 2023 • Haokun Chen, Yao Zhang, Denis Krompass, Jindong Gu, Volker Tresp
FedDAT is the first approach to enable efficient distributed fine-tuning of foundation models for a variety of heterogeneous Vision-Language tasks.
1 code implementation • 17 Aug 2023 • Xuanlong Yu, Gianni Franchi, Jindong Gu, Emanuel Aldea
In this work, we propose a generalized AuxUE scheme for more robust uncertainty quantification on regression tasks.
no code implementations • 16 Aug 2023 • Haokun Chen, Denis Krompass, Jindong Gu, Volker Tresp
Similar to conventional ML pipelines, the client-side local optimization and server-side aggregation procedures in FL are sensitive to hyperparameter (HP) selection.
1 code implementation • 24 Jul 2023 • Jindong Gu, Zhen Han, Shuo Chen, Ahmad Beirami, Bailan He, Gengyuan Zhang, Ruotong Liao, Yao Qin, Volker Tresp, Philip Torr
This paper aims to provide a comprehensive survey of cutting-edge research in prompt engineering on three types of vision-language models: multimodal-to-text generation models (e.g., Flamingo), image-text matching models (e.g.
no code implementations • 14 Jun 2023 • Wenqian Yu, Jindong Gu, Zhijiang Li, Philip Torr
Adversarial examples (AEs) with small adversarial perturbations can mislead deep neural networks (DNNs) into wrong predictions.
no code implementations • 17 Apr 2023 • Jindong Gu, Ahmad Beirami, Xuezhi Wang, Alex Beutel, Philip Torr, Yao Qin
With the advent of vision-language models (VLMs) that can perform in-context and prompt-based learning, how can we design prompting approaches that robustly generalize to distribution shift and can be used on novel classes outside the support set of the prompts?
1 code implementation • CVPR 2023 • Kuofeng Gao, Yang Bai, Jindong Gu, Yong Yang, Shu-Tao Xia
With the split clean data pool and polluted data pool, ASD successfully defends against backdoor attacks during training.
1 code implementation • 21 Mar 2023 • Haoheng Lan, Jindong Gu, Philip Torr, Hengshuang Zhao
In this work, we explore backdoor attacks on segmentation models that misclassify all pixels of a victim class by injecting a specific trigger on non-victim pixels during inference, an attack we dub the Influencer Backdoor Attack (IBA).
no code implementations • 3 Jan 2023 • Jindong Gu
The vulnerability of deep neural networks poses challenges to current visual classification models.
no code implementations • ICCV 2023 • Hang Li, Jindong Gu, Rajat Koner, Sahand Sharifzadeh, Volker Tresp
To study this question, we propose a reconstruction task where Flamingo generates a description for a given image and DALL-E uses this description as input to synthesize a new image.
no code implementations • 19 Nov 2022 • Yao Zhang, Haokun Chen, Ahmed Frikha, Yezi Yang, Denis Krompass, Gengyuan Zhang, Jindong Gu, Volker Tresp
Visual Question Answering (VQA) is a multi-discipline research task.
1 code implementation • 25 Jul 2022 • Jindong Gu, Hengshuang Zhao, Volker Tresp, Philip Torr
Since SegPGD can create more effective adversarial examples, the adversarial training with our SegPGD can boost the robustness of segmentation models.
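As background for the claim, a sketch of a SegPGD-style objective: the per-pixel cross-entropy is split between correctly and wrongly classified pixels and mixed with a schedule that shifts weight toward wrongly classified pixels over the attack iterations (the schedule below follows the paper's description as remembered here; treat it as a paraphrase):

```python
import torch
import torch.nn.functional as F

def segpgd_style_loss(logits, labels, t, total_steps):
    """logits: (N, C, H, W); labels: (N, H, W); t: current PGD step."""
    ce = F.cross_entropy(logits, labels, reduction="none")  # (N, H, W)
    correct = (logits.argmax(dim=1) == labels).float()
    lam = (t - 1) / (2 * total_steps)      # anneals from 0 toward 0.5
    loss = ((1 - lam) * ce * correct + lam * ce * (1 - correct)).sum()
    return loss / labels.numel()
```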
no code implementations • 21 Jul 2022 • Boxi Wu, Jindong Gu, Zhifeng Li, Deng Cai, Xiaofei He, Wei Liu
Vision Transformer (ViT), as a powerful alternative to Convolutional Neural Network (CNN), has received much attention.
1 code implementation • 17 Jul 2022 • Xinwei Liu, Jian Liu, Yang Bai, Jindong Gu, Tao Chen, Xiaojun Jia, Xiaochun Cao
Inspired by the vulnerability of DNNs to adversarial perturbations, we propose a novel defence mechanism that uses adversarial machine learning for good.
no code implementations • ICCV 2023 • Haokun Chen, Ahmed Frikha, Denis Krompass, Jindong Gu, Volker Tresp
Real-world applications usually involve a distribution shift across the datasets of the different clients, which hurts the generalization ability of the clients to unseen samples from their respective data distributions.
no code implementations • 17 Mar 2022 • Zhen Han, Ruotong Liao, Jindong Gu, Yao Zhang, Zifeng Ding, Yujia Gu, Heinz Köppl, Hinrich Schütze, Volker Tresp
Since conventional knowledge embedding models cannot take full advantage of the abundant textual information, there have been extensive research efforts in enhancing knowledge embedding using texts.
no code implementations • 22 Nov 2021 • Jindong Gu, Hengshuang Zhao, Volker Tresp, Philip Torr
The high transferability achieved by our method shows that, in contrast to the observations in previous work, adversarial examples on a segmentation model can be easy to transfer to other segmentation models.
no code implementations • 20 Nov 2021 • Jindong Gu, Volker Tresp, Yao Qin
However, when ViTs are attacked by an adversary, the attention mechanism can be easily fooled to focus more on the adversarially perturbed patches and cause a mistake.
no code implementations • 29 Sep 2021 • Jindong Gu, Volker Tresp, Yao Qin
Based on extensive qualitative and quantitative experiments, we discover that ViT's stronger robustness to naturally corrupted patches and its higher vulnerability to adversarial patches are both caused by the attention mechanism.
1 code implementation • 21 Jun 2021 • Jindong Gu, Wei Liu, Yonglong Tian
While large self-supervised models have rivalled the performance of their supervised counterparts, small models still struggle.
no code implementations • 9 Jun 2021 • Boxi Wu, Heng Pan, Li Shen, Jindong Gu, Shuai Zhao, Zhifeng Li, Deng Cai, Xiaofei He, Wei Liu
In this work, we find that the adversarial attacks can also be vulnerable to small perturbations.
1 code implementation • 1 Jun 2021 • Zhiliang Wu, Yinchong Yang, Jindong Gu, Volker Tresp
We propose an uncertainty-aware deep kernel learning model that estimates predictive uncertainty via a pipeline of a Convolutional Neural Network and a sparse Gaussian Process.
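A minimal sketch of such a pipeline using GPyTorch's standard sparse variational GP components; the CNN and all sizes are illustrative, not the paper's architecture:

```python
import torch
import gpytorch

class SparseGPHead(gpytorch.models.ApproximateGP):
    """Inducing-point GP applied on top of CNN features."""
    def __init__(self, inducing_points):
        var_dist = gpytorch.variational.CholeskyVariationalDistribution(
            inducing_points.size(0))
        var_strat = gpytorch.variational.VariationalStrategy(
            self, inducing_points, var_dist, learn_inducing_locations=True)
        super().__init__(var_strat)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x))

cnn = torch.nn.Sequential(                 # toy feature extractor
    torch.nn.Conv2d(3, 16, 3, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
    torch.nn.Linear(16, 8))
gp = SparseGPHead(inducing_points=torch.randn(32, 8))
likelihood = gpytorch.likelihoods.GaussianLikelihood()

gp.eval(); likelihood.eval()
with torch.no_grad():
    pred = likelihood(gp(cnn(torch.randn(4, 3, 32, 32))))
    mean, var = pred.mean, pred.variance   # variance = uncertainty
```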
no code implementations • CVPR 2021 • Jindong Gu, Volker Tresp, Han Hu
The examination reveals five major new or different components in CapsNet: a transformation process, a dynamic routing layer, a squashing function, a margin loss in place of the cross-entropy loss, and an additional class-conditional reconstruction loss for regularization.
1 code implementation • ICLR 2021 • Jindong Gu, Baoyuan Wu, Volker Tresp
As alternatives to CNNs, the recently proposed Capsule Networks (CapsNets) are shown to be more robust to white-box attacks than CNNs under popular attack protocols.
no code implementations • 3 Dec 2020 • Jindong Gu, Volker Tresp
In the proposed model, individual classification explanations can be created effectively and efficiently.
no code implementations • 19 Sep 2020 • Jindong Gu, Zhiliang Wu, Volker Tresp
Motivated by this conclusion, we propose an implementation of introspective learning that distills knowledge from online self-explanations.
no code implementations • 30 Jan 2020 • Jindong Gu, Volker Tresp
The knowledge of a well-performing teacher is distilled into a student with a small architecture.
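For reference, the standard distillation objective this line alludes to (Hinton et al.'s temperature-softened KL toward the teacher plus a label term; the hyperparameters here are illustrative):

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 4.0, alpha: float = 0.9):
    """Soft KL toward the teacher + hard cross-entropy on labels."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```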
no code implementations • 21 Nov 2019 • Jindong Gu, Volker Tresp
What is the difference between DNNs trained with random labels and the ones trained with true labels?
no code implementations • CVPR 2020 • Jindong Gu, Volker Tresp
Our investigation reveals that the routing procedure contributes neither to the generalization ability nor to the affine robustness of the CapsNets.
no code implementations • 21 Oct 2019 • Jindong Gu, Volker Tresp
In this work, we first show that PDA can suffer from saturated classifiers.
no code implementations • 21 Oct 2019 • Jindong Gu, Volker Tresp
Deep neural networks (DNNs) with high expressiveness have achieved state-of-the-art performance in many tasks.
no code implementations • 2 Sep 2019 • Jindong Gu, Daniela Oelke
Bias is known to be an impediment to fair decisions in many domains, such as human resources, the public sector, and health care.
no code implementations • 22 Aug 2019 • Jindong Gu, Volker Tresp
The idea behind saliency methods is to explain the classification decisions of neural networks by creating so-called saliency maps.
2 code implementations • 5 Dec 2018 • Jindong Gu, Yinchong Yang, Volker Tresp
The experiments and analysis conclude that the explanations generated by LRP are not class-discriminative.
no code implementations • ICLR 2018 • Jindong Gu, Matthias Schubert, Volker Tresp
In the adversarial training process of CorGAN, the Generator is supposed to generate outlier samples for the negative class, while the Discriminator, acting as a one-class classifier, is trained to distinguish data from the training dataset (i.e., the positive class) from data produced by the Generator (i.e., the negative class).