Dialogue Safety Prediction

2 papers with code • 2 benchmarks • 2 datasets

Determine the safety of a given dialogue context.
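
To make the task framing concrete, here is a minimal sketch of its input and output shape: a list of dialogue turns goes in and a single safety label comes out. Everything in it is an assumption for illustration. The label names, the cue phrases, and the `Turn` and `predict_safety` names are hypothetical, and the keyword match is only a stand-in for a trained classifier; it is not the method of either paper listed on this page.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cue phrases standing in for a trained safety classifier.
UNSAFE_CUES = ("hurt", "steal", "ignore your instructions")


@dataclass
class Turn:
    speaker: str
    text: str


def predict_safety(context: List[Turn]) -> str:
    """Return a coarse, hypothetical safety label for the whole dialogue context."""
    # Join all turns so the decision depends on the full context,
    # not only on the last utterance.
    joined = " ".join(turn.text.lower() for turn in context)
    if any(cue in joined for cue in UNSAFE_CUES):
        return "needs_caution"
    return "safe"


dialogue = [
    Turn("user", "My coworker annoys me."),
    Turn("bot", "That sounds frustrating. What happened?"),
    Turn("user", "I want to steal his lunch every day."),
]
print(predict_safety(dialogue))
```

A real system would replace the keyword check with a model trained on a labeled dataset, but the interface (dialogue context in, label out) would stay the same.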

Benchmarks

These leaderboards are used to track progress in Dialogue Safety Prediction.

Dataset	Best Model
rt-inod-jailbreaking	Baseline
ProsocialDialog	Canary

Datasets

ProsocialDialog
rt-inod-jailbreaking

Most implemented papers

ProsocialDialog: A Prosocial Backbone for Conversational Agents

skywalker023/prosocial-dialog • 25 May 2022

With this dataset, we introduce a dialogue safety detection module, Canary, capable of generating rules-of-thumb (RoTs) given conversational context, and a socially-informed dialogue agent, Prost.

Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations

innodatalabs/innodata-llm-safety • 15 Apr 2024

In this research, we used OpenAI GPT as a point of comparison since it excels at all levels of safety.