no code implementations • 24 Feb 2024 • Zhenhua Wang, Wei Xie, Baosheng Wang, Enze Wang, Zhiwen Gui, Shuoyoucheng Ma, Kai Chen
Our research provides a psychological explanation of jailbreak prompts.
Decision Making • Language Modelling • +1