Search Results for author: Xinyue Shen

Found 10 papers, 5 papers with code

Comprehensive Assessment of Jailbreak Attacks Against LLMs

no code implementations • 8 Feb 2024 • Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, Yang Zhang

Some jailbreak prompt datasets, available from the Internet, can also achieve high attack success rates on many LLMs, such as ChatGLM3, GPT-3.5, and PaLM2.

Ethics


Comprehensive Assessment of Toxicity in ChatGPT

no code implementations • 3 Nov 2023 • Boyang Zhang, Xinyue Shen, Wai Man Si, Zeyang Sha, Zeyuan Chen, Ahmed Salem, Yun Shen, Michael Backes, Yang Zhang

Moderating offensive, hateful, and toxic language has always been an important but challenging topic in the domain of safe use in NLP.


"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

1 code implementation • 7 Aug 2023 • Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, Yang Zhang

The misuse of large language models (LLMs) has garnered significant attention from the general public and LLM vendors.

Community Detection

190


Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models

1 code implementation • 23 May 2023 • Yiting Qu, Xinyue Shen, Xinlei He, Michael Backes, Savvas Zannettou, Yang Zhang

Our evaluation result shows that 24% of the generated images using DreamBooth are hateful meme variants that present the features of the original hateful meme and the target individual/community; these generated images are comparable to hateful meme variants collected from the real world.


In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT

no code implementations • 18 Apr 2023 • Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang

In this paper, we perform the first large-scale measurement of ChatGPT's reliability in the generic QA scenario with a carefully curated set of 5,695 questions across ten datasets and eight domains.

Question Answering


MGTBench: Benchmarking Machine-Generated Text Detection

2 code implementations • 26 Mar 2023 • Xinlei He, Xinyue Shen, Zeyuan Chen, Michael Backes, Yang Zhang

Extensive evaluations on public datasets with curated texts generated by various powerful LLMs such as ChatGPT-turbo and Claude demonstrate the effectiveness of different detection methods.

Benchmarking, Question Answering (+4 more)

123


Prompt Stealing Attacks Against Text-to-Image Generation Models

1 code implementation • 20 Feb 2023 • Xinyue Shen, Yiting Qu, Michael Backes, Yang Zhang

In this paper, we perform the first study on understanding the threat of a novel attack, namely prompt stealing attack, which aims to steal prompts from generated images by text-to-image generation models.

Text-to-Image Generation


Backdoor Attacks in the Supply Chain of Masked Image Modeling

no code implementations • 4 Oct 2022 • Xinyue Shen, Xinlei He, Zheng Li, Yun Shen, Michael Backes, Yang Zhang

Different from previous work, we are the first to systematically perform threat modeling on SSL in every phase of the model supply chain, i.e., pre-training, release, and downstream phases.

Contrastive Learning, Self-Supervised Learning


Nonconvex Sparse Logistic Regression with Weakly Convex Regularization

no code implementations • 7 Aug 2017 • Xinyue Shen, Yuantao Gu

In this work we propose to fit a sparse logistic regression model by a weakly convex regularized nonconvex optimization problem.

Regression


Disciplined Multi-Convex Programming

3 code implementations • 12 Sep 2016 • Xinyue Shen, Steven Diamond, Madeleine Udell, Yuantao Gu, Stephen Boyd

A multi-convex optimization problem is one in which the variables can be partitioned into sets over which the problem is convex when the other variables are fixed.

Optimization and Control

