Search Results for author: Rong Bao

Found 4 papers, 2 papers with code

PlugAT: A Plug and Play Module to Defend against Textual Adversarial Attack

no code implementations COLING 2022 Rui Zheng, Rong Bao, Qin Liu, Tao Gui, Qi Zhang, Xuanjing Huang, Rui Xie, Wei Wu

To reduce the potential side effects of using defense modules, we further propose a novel forgetting restricted adversarial training, which filters out bad adversarial examples that impair the performance of original ones.

Adversarial Attack, Domain Adaptation, +2

Mitigating Reward Hacking via Information-Theoretic Reward Modeling

no code implementations 14 Feb 2024 Yuchun Miao, Sen Zhang, Liang Ding, Rong Bao, Lefei Zhang, DaCheng Tao

Inspired by this finding, we propose the Integrated Cluster Deviation Score (ICDS), which quantifies deviations in the latent space, as an indicator of reward overoptimization to facilitate the development of online mitigation strategies.

Orthogonal Subspace Learning for Language Model Continual Learning

1 code implementation 22 Oct 2023 Xiao Wang, Tianze Chen, Qiming Ge, Han Xia, Rong Bao, Rui Zheng, Qi Zhang, Tao Gui, Xuanjing Huang

In this paper, we propose orthogonal low-rank adaptation (O-LoRA), a simple and efficient approach for continual learning in language models, effectively mitigating catastrophic forgetting while learning new tasks.

Continual Learning, Language Modelling

Robust Lottery Tickets for Pre-trained Language Models

2 code implementations ACL 2022 Rui Zheng, Rong Bao, Yuhao Zhou, Di Liang, Sirui Wang, Wei Wu, Tao Gui, Qi Zhang, Xuanjing Huang

Recent works on Lottery Ticket Hypothesis have shown that pre-trained language models (PLMs) contain smaller matching subnetworks (winning tickets) which are capable of reaching accuracy comparable to the original models.

Adversarial Robustness
