1 code implementation • 19 Jan 2024 • Adib Hasan, Ileana Rugina, Alex Wang
Large Language Models (LLMs) are susceptible to "jailbreaking" prompts, which can induce the generation of harmful content.