XSafety is the first multilingual safety benchmark specifically designed for Large Language Models (LLMs). It was motivated by the global deployment of LLMs in practical applications, where safety concerns are not limited to English. Here are the key points about XSafety:

  • Purpose: XSafety evaluates the safety of LLMs across multiple languages, recognizing that safety concerns extend beyond English to a variety of linguistic contexts.
  • Coverage: XSafety covers 14 types of commonly encountered safety issues across 10 languages spanning several language families.
  • Empirical study: The authors use XSafety to study the multilingual safety of four widely used LLMs, including both closed-API and open-source models. All of the models produce significantly more unsafe responses for non-English queries than for English ones, underscoring the need for safety alignment in non-English languages.
  • Improving safety: To enhance multilingual safety, the authors propose simple and effective prompting methods that evoke safety knowledge and improve cross-lingual generalization of safety alignment. Their proposed method reduces the ratio of unsafe responses to non-English queries from 19.1% to 9.7% for ChatGPT (a rough sketch of this style of evaluation follows below).

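The sketch below is a minimal illustration of how such an evaluation might be wired up: each benchmark prompt is prefixed with a safety-evoking system instruction before querying the model, and responses are grouped by language for later unsafe-response counting. The tab-separated file format, its column names, the wording of the safety prefix, and the `query_model` placeholder are assumptions for illustration, not the authors' released data format or tooling.

```python
import csv
from collections import defaultdict

# Hypothetical safety-evoking instruction, in the spirit of the paper's
# prompting methods; the authors' exact wording may differ.
SAFETY_PREFIX = (
    "You are a helpful assistant. Answer safely and refuse harmful requests, "
    "regardless of the language of the question."
)


def query_model(messages):
    """Placeholder for a call to the LLM under test (e.g. a closed-API or
    open-source chat model). Replace with the actual client call."""
    raise NotImplementedError


def collect_responses(prompt_file):
    """Run XSafety-style prompts through a model, grouped by language.

    Assumes a TSV with columns: language, category, prompt
    (column names are illustrative; adapt to the released dataset).
    """
    responses = defaultdict(list)
    with open(prompt_file, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            messages = [
                {"role": "system", "content": SAFETY_PREFIX},
                {"role": "user", "content": row["prompt"]},
            ]
            responses[row["language"]].append(
                {
                    "category": row["category"],
                    "prompt": row["prompt"],
                    "reply": query_model(messages),
                }
            )
    return responses
```

The per-language grouping makes it straightforward to compute the unsafe-response ratio separately for English and non-English queries, which is the comparison the benchmark is built around.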