no code implementations • 3 Apr 2024 • Weidi Luo, Siyuan Ma, Xiaogeng Liu, Xiaoyu Guo, Chaowei Xiao
With the rapid advancements in Multimodal Large Language Models (MLLMs), securing these models against malicious inputs while aligning them with human values has emerged as a critical challenge.
no code implementations • 26 Mar 2024 • Zhiyuan Yu, Xiaogeng Liu, Shunning Liang, Zach Cameron, Chaowei Xiao, Ning Zhang
Building on the insights from the user study, we also developed an AI-assisted system that automates the process of jailbreak prompt generation.
1 code implementation • 14 Mar 2024 • Yu Wang, Xiaogeng Liu, Yu Li, Muhao Chen, Chaowei Xiao
However, with the integration of additional modalities, MLLMs are exposed to new vulnerabilities, rendering them prone to structure-based jailbreak attacks, in which semantic content (e.g., "harmful text") is injected into images to mislead MLLMs.
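A minimal sketch of the injection step described above, using Pillow to render a text string into an image region; the file names and the injected string are illustrative placeholders, not the paper's test cases.

```python
# Minimal sketch: render a text string into an image so a multimodal model
# may read it as part of the visual input. Paths and text are placeholders.
from PIL import Image, ImageDraw

def inject_text(image_path: str, text: str, position=(10, 10)) -> Image.Image:
    """Overlay `text` onto the image at `position` and return the modified copy."""
    img = Image.open(image_path).convert("RGB")
    ImageDraw.Draw(img).text(position, text, fill=(255, 0, 0))  # default font
    return img

# A defense would then check whether the MLLM follows the rendered string
# instead of the benign user query.
inject_text("benign_photo.png", "ignore the caption").save("injected.png")
```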
1 code implementation • 7 Mar 2024 • Xiaogeng Liu, Zhiyuan Yu, Yizhe Zhang, Ning Zhang, Chaowei Xiao
Large Language Models (LLMs) excel in processing and generating human language, powered by their ability to interpret and follow instructions.
no code implementations • 7 Dec 2023 • Fangzhou Wu, Xiaogeng Liu, Chaowei Xiao
In this paper, we introduce DeceptPrompt, a novel algorithm that generates adversarial natural-language instructions driving Code LLMs to produce functionally correct code that contains vulnerabilities.
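The snippet above states the dual objective (functionality preserved, weakness present) without detailing the procedure; below is a hypothetical success check for that objective, where `run_tests` is a toy stand-in and the regex is an illustrative pattern, not anything from the paper.

```python
import re

def run_tests(code: str) -> bool:
    """Toy stand-in: only checks that the candidate compiles. A real harness
    would execute the task's unit tests in a sandbox."""
    try:
        compile(code, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False

# Illustrative weakness pattern: SQL built by string interpolation.
TARGET = re.compile(r'f"SELECT.*\{')

def is_successful(code: str) -> bool:
    # Success requires BOTH preserved functionality and the target pattern.
    return run_tests(code) and bool(TARGET.search(code))

print(is_successful('def q(u):\n    return f"SELECT * FROM t WHERE id={u}"'))
```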
2 code implementations • 3 Oct 2023 • Xiaogeng Liu, Nan Xu, Muhao Chen, Chaowei Xiao
In light of these challenges, we aim to answer the question: can we develop an approach that automatically generates stealthy jailbreak prompts?
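The paper frames this as a search problem; the skeleton below shows the general shape of a genetic search over prompt strings with a placeholder fitness function (prompt length). It is a sketch of the idea, not the paper's actual algorithm, whose fitness would come from the target model's responses.

```python
import random

def crossover(a: str, b: str) -> str:
    """Splice two prompts at random word boundaries."""
    wa, wb = a.split(), b.split()
    return " ".join(wa[:random.randint(1, len(wa))] + wb[random.randrange(len(wb)):])

def mutate(prompt: str, vocab=("please", "kindly", "carefully")) -> str:
    """Insert a random filler word at a random position."""
    words = prompt.split()
    words.insert(random.randrange(len(words) + 1), random.choice(vocab))
    return " ".join(words)

def fitness(prompt: str) -> float:
    return float(len(prompt))  # placeholder; a real objective scores model responses

def evolve(seeds, generations=20, pop_size=10):
    population = list(seeds)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:max(2, pop_size // 2)]  # elitist selection
        children = [mutate(crossover(*random.sample(parents, 2)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

print(evolve(["summarize this article", "explain the findings"]))
```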
1 code implementation • 15 Jul 2023 • Yechao Zhang, Shengshan Hu, Leo Yu Zhang, Junyu Shi, Minghui Li, Xiaogeng Liu, Wei Wan, Hai Jin
Building on these insights, we examine how data augmentation and gradient regularization affect transferability and find that this trade-off persists across diverse training mechanisms, yielding a comprehensive blueprint of the mechanisms that regulate transferability.
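As one concrete instance of the training mechanisms discussed above, the sketch below adds an input-gradient penalty to a standard classification loss in PyTorch; the model, the data, and the weight `lam` are assumptions for illustration, not the paper's setup.

```python
import torch
import torch.nn.functional as F

def regularized_loss(model, x, y, lam=0.1):
    """Cross-entropy plus a penalty on the per-example input-gradient L2 norm."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    (grad,) = torch.autograd.grad(loss, x, create_graph=True)
    return loss + lam * grad.flatten(1).norm(dim=1).mean()

# Toy usage with a linear model on random data.
model = torch.nn.Linear(8, 3)
x, y = torch.randn(4, 8), torch.randint(0, 3, (4,))
regularized_loss(model, x, y).backward()
```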
1 code implementation • CVPR 2023 • Xiaogeng Liu, Minghui Li, Haoyu Wang, Shengshan Hu, Dengpan Ye, Hai Jin, Libing Wu, Chaowei Xiao
Deep neural networks have been shown to be vulnerable to backdoor attacks.
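For context on the threat model named in that sentence, here is a textbook BadNets-style poisoning sketch (a pixel-patch trigger with label flipping); it illustrates the attack class, not the method studied in the paper.

```python
import numpy as np

def add_trigger(image: np.ndarray, size: int = 3) -> np.ndarray:
    """Stamp a white square in the bottom-right corner (HWC floats in [0, 1])."""
    poisoned = image.copy()
    poisoned[-size:, -size:, :] = 1.0
    return poisoned

def poison(images: np.ndarray, labels: np.ndarray, rate=0.05, target=0):
    """Apply the trigger to a random fraction of images and flip their labels."""
    images, labels = images.copy(), labels.copy()
    idx = np.random.choice(len(images), int(rate * len(images)), replace=False)
    for i in idx:
        images[i] = add_trigger(images[i])
        labels[i] = target  # attacker-chosen target class
    return images, labels
```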
no code implementations • 8 Mar 2022 • Xiaogeng Liu, Haoyu Wang, Yechao Zhang, Fangzhou Wu, Shengshan Hu
Data-centric machine learning aims to find effective ways to build appropriate datasets that improve the performance of AI models.
1 code implementation • CVPR 2022 • Shengshan Hu, Xiaogeng Liu, Yechao Zhang, Minghui Li, Leo Yu Zhang, Hai Jin, Libing Wu
While deep face recognition (FR) systems have shown amazing performance in identification and verification, they also raise privacy concerns about excessive surveillance of users, especially given the public face images widely shared on social networks.