1 code implementation • 6 Nov 2023 • Xuan Li, Zhanke Zhou, Jianing Zhu, Jiangchao Yao, Tongliang Liu, Bo Han
Despite remarkable success in various applications, large language models (LLMs) are vulnerable to adversarial jailbreaks that nullify their safety guardrails.
1 code implementation • 6 Jun 2023 • Jianing Zhu, Xiawei Guo, Jiangchao Yao, Chao Du, Li He, Shuo Yuan, Tongliang Liu, Liang Wang, Bo Han
In this paper, we dive into the perspective of model dynamics and propose a novel information measure, namely, Memorization Discrepancy, to explore the defense via the model-level information.
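The snippet above does not spell the measure out; purely as a rough illustration, the sketch below assumes Memorization Discrepancy is the KL divergence between the predictive distributions of two model snapshots on the same batch. The function name and usage are illustrative, not the paper's API.

```python
import torch
import torch.nn.functional as F

def memorization_discrepancy(model_before, model_after, x):
    """Hypothetical sketch: KL divergence between the predictions of two
    model snapshots on the same batch x (larger value = the update changed
    how the model treats x more)."""
    with torch.no_grad():
        log_p_before = F.log_softmax(model_before(x), dim=1)
        p_after = F.softmax(model_after(x), dim=1)
    # KL(p_after || p_before), averaged over the batch.
    return F.kl_div(log_p_before, p_after, reduction="batchmean")
```

A batch whose discrepancy spikes relative to clean reference data could then be flagged as potentially poisoned.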
1 code implementation • 6 Jun 2023 • Jianing Zhu, Hengzhuang Li, Jiangchao Yao, Tongliang Liu, Jianliang Xu, Bo Han
Based on such insights, we propose a novel method, Unleashing Mask, which aims to restore the OOD discriminative capabilities of the well-trained model with ID data.
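As a minimal sketch of the idea (assuming the mask simply filters out atypical, high-loss ID samples before fine-tuning; the paper's actual masking criterion may differ):

```python
import torch
import torch.nn.functional as F

def loss_based_mask(model, loader, threshold):
    """Hypothetical sketch: mark ID samples whose loss under the
    well-trained model exceeds a threshold as 'atypical' so they can be
    excluded (or forgotten) during fine-tuning."""
    keep = []
    model.eval()
    with torch.no_grad():
        for x, y in loader:
            per_sample_loss = F.cross_entropy(model(x), y, reduction="none")
            keep.append(per_sample_loss <= threshold)  # True = typical, kept
    return torch.cat(keep)
```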
1 code implementation • 1 Mar 2023 • Jianing Zhu, Jiangchao Yao, Tongliang Liu, Quanming Yao, Jianliang Xu, Bo Han
Privacy and security concerns in real-world applications have led to the development of adversarially robust federated models.
1 code implementation • 1 Nov 2022 • Jianan Zhou, Jianing Zhu, Jingfeng Zhang, Tongliang Liu, Gang Niu, Bo Han, Masashi Sugiyama
Adversarial training (AT) with imperfect supervision is significant but receives limited attention.
no code implementations • 29 Sep 2021 • Jianing Zhu, Jiangchao Yao, Tongliang Liu, Kunyang Jia, Jingren Zhou, Bo Han, Hongxia Yang
Federated Adversarial Training (FAT) helps address data privacy and governance issues while maintaining model robustness against adversarial attacks.
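As a generic sketch of the FAT setting (not this paper's specific algorithm): each client runs local adversarial training, and the server averages the resulting weights, FedAvg-style. Here `make_adv` is an assumed helper that crafts adversarial examples, e.g. via a few PGD steps.

```python
import copy
import torch
import torch.nn.functional as F

def fat_round(global_model, clients, make_adv, local_epochs=1):
    """Generic sketch of one Federated Adversarial Training round:
    local adversarial training on each client, then FedAvg aggregation.
    `clients` yields (loader, optimizer_fn) pairs; illustrative only."""
    states = []
    for loader, optimizer_fn in clients:
        local = copy.deepcopy(global_model)
        opt = optimizer_fn(local.parameters())
        for _ in range(local_epochs):
            for x, y in loader:
                x_adv = make_adv(local, x, y)  # client-side attack
                loss = F.cross_entropy(local(x_adv), y)
                opt.zero_grad()
                loss.backward()
                opt.step()
        states.append(local.state_dict())
    # FedAvg: parameter-wise mean over the client models.
    avg = {k: torch.stack([s[k].float() for s in states]).mean(0)
           for k in states[0]}
    global_model.load_state_dict(avg)
```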
2 code implementations • ICLR 2022 • Jianing Zhu, Jiangchao Yao, Bo Han, Jingfeng Zhang, Tongliang Liu, Gang Niu, Jingren Zhou, Jianliang Xu, Hongxia Yang
However, when considering adversarial robustness, teachers may become unreliable, and adversarial distillation may not work: teachers are pretrained on their own adversarial data, and it is too demanding to require that they also perform well on every adversarial example queried by students.
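One way to make that concrete (a hedged sketch, not the paper's actual loss): trust the teacher on a student-crafted adversarial example only to the extent the teacher still classifies it correctly, and fall back to the hard label otherwise.

```python
import torch
import torch.nn.functional as F

def partially_trusted_distillation_loss(student_logits, teacher_logits, y):
    """Hypothetical sketch: per-example trust = teacher's confidence in
    the true class on the student's adversarial example; interpolate
    between distillation (KL to teacher) and plain cross-entropy."""
    with torch.no_grad():
        teacher_probs = F.softmax(teacher_logits, dim=1)
        trust = teacher_probs.gather(1, y.unsqueeze(1)).squeeze(1)
    kd = F.kl_div(F.log_softmax(student_logits, dim=1), teacher_probs,
                  reduction="none").sum(dim=1)
    ce = F.cross_entropy(student_logits, y, reduction="none")
    return (trust * kd + (1.0 - trust) * ce).mean()
```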
no code implementations • 6 Feb 2021 • Jianing Zhu, Jingfeng Zhang, Bo Han, Tongliang Liu, Gang Niu, Hongxia Yang, Mohan Kankanhalli, Masashi Sugiyama
A recent adversarial training (AT) study showed that the number of projected gradient descent (PGD) steps needed to successfully attack a point (i.e., find an adversarial example in its proximity) is an effective measure of that point's robustness.
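That measure is straightforward to sketch, assuming a single example with pixels in [0, 1] and an L∞ ball (the hyperparameter defaults below are the usual CIFAR-style choices, picked here for illustration):

```python
import torch
import torch.nn.functional as F

def pgd_steps_to_flip(model, x, y, eps=8/255, alpha=2/255, max_steps=10):
    """Sketch: count PGD steps until the model's prediction first flips
    away from the true label y (batch of size 1). Fewer steps = the
    point is easier to attack, i.e., less robust."""
    x_adv = x.clone().detach()
    for step in range(1, max_steps + 1):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into eps-ball
            x_adv = x_adv.clamp(0, 1)
            if model(x_adv).argmax(dim=1).item() != y.item():
                return step  # attack succeeded at this step
    return max_steps  # never flipped within the budget
```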
2 code implementations • ICLR 2021 • Jingfeng Zhang, Jianing Zhu, Gang Niu, Bo Han, Masashi Sugiyama, Mohan Kankanhalli
The long-standing belief that robustness and accuracy hurt each other was challenged by recent studies showing that robustness can be maintained while accuracy improves.
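Purely as an illustration of instance-reweighted adversarial training (the specific weighting rule below, including the tanh shaping and λ, is hypothetical): points that need fewer PGD steps to attack receive larger weight in the training loss.

```python
import torch

def reweighted_at_loss(per_example_loss, kappa, total_steps, lam=0.0):
    """Hypothetical sketch: upweight easily attacked points. `kappa` is a
    float tensor of per-example PGD steps-to-flip (see the sketch above);
    `total_steps` is the PGD step budget."""
    w = (1 + torch.tanh(lam + 5 * (1 - 2 * kappa / total_steps))) / 2
    w = w / w.sum()  # normalize weights over the batch
    return (w * per_example_loss).sum()
```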