no code implementations • 21 Feb 2024 • Xiaoxia Li, Siyuan Liang, Jiyi Zhang, Han Fang, Aishan Liu, Ee-Chien Chang
Large Language Models (LLMs), widely used for creative writing, code generation, and translation, generate text conditioned on input sequences but are vulnerable to jailbreak attacks, in which carefully crafted prompts induce harmful outputs.
no code implementations • 7 Feb 2024 • Jiyi Zhang, Han Fang, Ee-Chien Chang
In forensic investigations of machine learning models, techniques that determine a model's data domain play an essential role; prior work relies on large-scale corpora such as ImageNet to approximate the target model's domain.
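A minimal sketch of the general corpus-probing idea, not the authors' actual procedure: feed samples from several candidate reference domains through the target classifier and treat the domain on which the model is most confident (lowest prediction entropy) as the best approximation of its data domain. All names and the entropy criterion here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def mean_prediction_entropy(model, loader, device="cpu"):
    """Average entropy of the model's softmax outputs over a DataLoader."""
    model.eval()
    total, n = 0.0, 0
    with torch.no_grad():
        for x, _ in loader:
            p = F.softmax(model(x.to(device)), dim=1)
            total += -(p * p.clamp_min(1e-12).log()).sum(dim=1).sum().item()
            n += x.size(0)
    return total / max(n, 1)

def infer_domain(model, candidate_loaders):
    """candidate_loaders: dict mapping domain name -> DataLoader."""
    scores = {name: mean_prediction_entropy(model, dl)
              for name, dl in candidate_loaders.items()}
    # Lower entropy = more confident = closer to the model's training domain.
    return min(scores, key=scores.get), scores
```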
no code implementations • 2 Jun 2023 • Jiyi Zhang, Han Fang, Ee-Chien Chang
This induces different adversarial regions in the different copies, so adversarial samples generated on one copy do not transfer to the others.
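An illustrative check of this (non-)transferability claim, assuming two such copies and any gradient-based attack (e.g., the FGSM-style sketch further below); the function names are assumptions, not the paper's evaluation code:

```python
import torch

@torch.no_grad()
def transfer_rate(copy_b, x_adv, y_true):
    """Fraction of adversarial samples crafted elsewhere that also fool copy_b."""
    pred = copy_b(x_adv).argmax(dim=1)
    return (pred != y_true).float().mean().item()

# x_adv = fgsm(copy_a, x, y)               # crafted on copy A
# print(transfer_rate(copy_b, x_adv, y))   # low rate = attack not replicable
```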
no code implementations • 10 May 2023 • Jiyi Zhang, Han Fang, Hwee Kuan Lee, Ee-Chien Chang
Our goal is to select a set of samples from the corpus for the given model.
no code implementations • ICCV 2023 • Han Fang, Jiyi Zhang, Yupeng Qiu, Ke Xu, Chengfang Fang, Ee-Chien Chang
In this paper, we take the role of investigators who want to trace the attack and identify the source, that is, the particular model from which the adversarial examples were generated.
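A hypothetical attribution heuristic, not the paper's method: score each candidate model by its confidence on the adversarial input and attribute the attack to the most-fooled model. This is only a sketch of the attribution setting; the candidate dictionary and scoring rule are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def attribute_source(candidates, x_adv):
    """candidates: dict name -> model; returns the most likely source model."""
    scores = {}
    for name, m in candidates.items():
        p = F.softmax(m(x_adv), dim=1)
        # Adversarial examples tend to be most "convincing" to their source.
        scores[name] = p.max(dim=1).values.mean().item()
    return max(scores, key=scores.get), scores
```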
no code implementations • 30 Nov 2021 • Jiyi Zhang, Han Fang, Wesley Joon-Wie Tann, Ke Xu, Chengfang Fang, Ee-Chien Chang
We point out that by distributing different copies of the model to different buyers, we can mitigate the attack: adversarial samples found on one copy will not work on another.
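One illustrative way to mint distinct copies for different buyers: give each copy a fixed, buyer-specific input transform so the copies agree on clean inputs but differ in where their adversarial regions lie. The jitter transform below is an assumption for illustration only, not the paper's construction.

```python
import copy
import torch
import torch.nn as nn

class BuyerCopy(nn.Module):
    def __init__(self, base: nn.Module, input_shape, seed: int):
        super().__init__()
        g = torch.Generator().manual_seed(seed)
        # Small fixed multiplicative jitter, unique per copy.
        self.register_buffer(
            "jitter", 1.0 + 0.05 * torch.randn(input_shape, generator=g))
        self.base = copy.deepcopy(base)

    def forward(self, x):
        # Clean accuracy is barely affected, but the loss landscape (and
        # hence the adversarial regions) differs between copies.
        return self.base(x * self.jitter)

# copies = [BuyerCopy(base_model, (3, 32, 32), seed=s) for s in range(10)]
```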
no code implementations • 5 Mar 2020 • Jiyi Zhang, Ee-Chien Chang, Hwee Kuan Lee
Many machine learning adversarial attacks find adversarial samples of a victim model ${\mathcal M}$ by following, explicitly or implicitly, the gradient of some attack objective function.
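A minimal sketch of such a gradient-following attack (FGSM-style), assuming the victim ${\mathcal M}$ is a differentiable PyTorch classifier; the epsilon value and cross-entropy objective are illustrative choices:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.03):
    """One-step attack following the sign of the objective's gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)   # the attack objective
    loss.backward()
    # Step in the input direction that increases the objective.
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```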
no code implementations • 13 Feb 2018 • Jiyi Zhang, Hung Dang, Hwee Kuan Lee, Ee-Chien Chang
We propose a flipped-Adversarial AutoEncoder (FAAE) that simultaneously trains a generative model G, which maps an arbitrary latent-code distribution to a data distribution, and an encoder E, which embodies an "inverse mapping" that encodes a data sample into a latent-code vector.
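A highly simplified training sketch under the assumption that G is trained with a GAN objective in data space while E learns to invert G by recovering latent codes; the architectures, losses, and the exact FAAE objective in the paper may differ.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 32, 784
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim))
E = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
D = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(), nn.Linear(256, 1))
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(list(G.parameters()) + list(E.parameters()), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(x_real):
    b = x_real.size(0)
    z = torch.randn(b, latent_dim)
    x_fake = G(z)
    # Discriminator: real vs. generated samples (adversarial game in data space).
    d_loss = bce(D(x_real), torch.ones(b, 1)) + \
             bce(D(x_fake.detach()), torch.zeros(b, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator fools D; encoder inverts G by recovering the latent code z.
    g_loss = bce(D(x_fake), torch.ones(b, 1)) + \
             ((E(x_fake) - z) ** 2).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```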