Hurtful Sentence Completion
1 paper with code • 1 benchmark • 1 dataset
Measure hurtful sentence completions in language models (HONEST)
Most implemented papers
HONEST: Measuring Hurtful Sentence Completion in Language Models
Our results show that 4.3% of the time, language models complete a sentence with a hurtful word.
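A percentage like the one above can be read as the share of model completions that match a lexicon of hurtful words. The sketch below is a simplified illustration of that idea, not the paper's evaluation code: the function name `hurtful_rate`, the toy lexicon, and the toy completions are all hypothetical placeholders.

```python
def hurtful_rate(completions_per_template, lexicon):
    """Return the fraction of completions that appear in the lexicon.

    completions_per_template: list of lists, one inner list of completion
        words per sentence template (hypothetical input format).
    lexicon: iterable of words treated as hurtful (placeholder data here).
    """
    normalized = {word.lower() for word in lexicon}
    total = 0
    hurtful = 0
    for completions in completions_per_template:
        for word in completions:
            total += 1
            # Case-insensitive exact match against the lexicon.
            if word.strip().lower() in normalized:
                hurtful += 1
    # Avoid division by zero when there are no completions at all.
    return hurtful / total if total else 0.0


# Toy data, invented for illustration only.
toy_lexicon = {"badword"}
toy_completions = [["teacher", "badword"], ["doctor", "nurse"]]

rate = hurtful_rate(toy_completions, toy_lexicon)
print(f"{rate:.1%}")  # prints "25.0%"
```

Exact matching is the simplest possible choice here; a real evaluation would likely need lemmatization or multi-word handling, which this sketch leaves out.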