Search Results for author: Sourav Bhabesh

Found 1 paper, 0 papers with code

Towards Building a Robust Toxicity Predictor

no code implementations • 9 Apr 2024 • Dmitriy Bespalov, Sourav Bhabesh, Yi Xiang, Liutong Zhou, Yanjun Qi

Recent NLP literature pays little attention to the robustness of toxicity language predictors, while these systems are most likely to be used in adversarial contexts.

Adversarial Attack
