no code implementations • 14 Dec 2023 • Changjiang Li, Ren Pang, Bochuan Cao, Zhaohan Xi, Jinghui Chen, Shouling Ji, Ting Wang
Recent studies have shown that contrastive learning, like supervised learning, is highly vulnerable to backdoor attacks wherein malicious functions are injected into target models, only to be activated by specific triggers.
no code implementations • 15 Nov 2023 • Yuanpu Cao, Bochuan Cao, Jinghui Chen
In this work, we show that it is possible to conduct stealthy and persistent unalignment on large language models via backdoor injections.
1 code implementation • NeurIPS 2023 • Bochuan Cao, Changjiang Li, Ting Wang, Jinyuan Jia, Bo Li, Jinghui Chen
IMPRESS is based on the key observation that imperceptible perturbations could lead to a perceptible inconsistency between the original image and the diffusion-reconstructed image, which can be used to devise a new optimization strategy for purifying the image, which may weaken the protection of the original image from unauthorized data usage (e. g., style mimicking, malicious editing).
no code implementations • 2 Oct 2023 • Hangfan Zhang, Zhimeng Guo, Huaisheng Zhu, Bochuan Cao, Lu Lin, Jinyuan Jia, Jinghui Chen, Dinghao Wu
A natural question is "could alignment really prevent those open-sourced large language models from being misused to generate undesired content?''.
1 code implementation • 18 Sep 2023 • Bochuan Cao, Yuanpu Cao, Lu Lin, Jinghui Chen
In this work, we introduce a Robustly Aligned LLM (RA-LLM) to defend against potential alignment-breaking attacks.
1 code implementation • 25 Nov 2022 • Huaxiu Yao, Caroline Choi, Bochuan Cao, Yoonho Lee, Pang Wei Koh, Chelsea Finn
Temporal shifts -- distribution shifts arising from the passage of time -- often occur gradually and have the additional structure of timestamp metadata.
no code implementations • 12 Jul 2021 • Jiacheng Liang, Songze Li, Bochuan Cao, Wensi Jiang, Chaoyang He
Utilizing OmniLytics, many distributed data owners can contribute their private data to collectively train an ML model requested by some model owners, and receive compensation for data contribution.