1 code implementation • 27 Mar 2024 • Keyan Guo, Ayush Utkarsh, Wenbo Ding, Isabelle Ondracek, Ziming Zhao, Guo Freeman, Nishant Vishwamitra, Hongxin Hu
Online user-generated content games (UGCGs) are increasingly popular among children and adolescents for social interaction and more creative online entertainment.
1 code implementation • 19 Jan 2024 • Mazal Bethany, Brandon Wherry, Nishant Vishwamitra, Peyman Najafirad
This process involves addressing two key problems: (1) the platform must give an accurate rationale for obfuscating an unsafe image, grounded in attributes specific to that image, and (2) the unsafe regions of the image must be minimally obfuscated while the safe regions are still depicted.
no code implementations • 18 Jan 2024 • Mazal Bethany, Athanasios Galiopoulos, Emet Bethany, Mohammad Bahrami Karkevandi, Nishant Vishwamitra, Peyman Najafirad
The critical threat of phishing emails has been further exacerbated by the potential of LLMs to generate highly targeted, personalized, and automated spear phishing attacks.
1 code implementation • 17 Jan 2024 • Mazal Bethany, Brandon Wherry, Emet Bethany, Nishant Vishwamitra, Anthony Rios, Peyman Najafirad
We first study the effectiveness of state-of-the-art approaches and find that they are severely limited against text produced by diverse generators and domains in the real world.
no code implementations • 7 Jan 2024 • Keyan Guo, Alexander Hu, Jaden Mu, Ziheng Shi, Ziming Zhao, Nishant Vishwamitra, Hongxin Hu
Our study reveals that a meticulously crafted reasoning prompt can effectively capture the context of hate speech by fully utilizing the knowledge base in LLMs, significantly outperforming existing techniques.
1 code implementation • 22 Dec 2023 • Nishant Vishwamitra, Keyan Guo, Farhan Tajwar Romit, Isabelle Ondracek, Long Cheng, Ziming Zhao, Hongxin Hu
HATEGUARD further achieves prompt-based zero-shot detection by automatically generating and updating detection prompts with new derogatory terms and targets in new wave samples to effectively address new waves of online hate.
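The prompt-updating idea in the excerpt above can be illustrated with a minimal sketch. This is not HATEGUARD's implementation: the `DetectionPrompt` class, its method names, the prompt wording, and the placeholder terms and targets are all hypothetical, chosen only to show how a zero-shot detection prompt might be regenerated as new terms and targets appear in later waves.

```python
from dataclasses import dataclass, field


@dataclass
class DetectionPrompt:
    """Hypothetical container for the terms and targets a prompt mentions."""
    terms: set[str] = field(default_factory=set)
    targets: set[str] = field(default_factory=set)

    def update(self, new_terms, new_targets):
        """Merge terms and targets observed in a new wave of samples."""
        self.terms.update(new_terms)
        self.targets.update(new_targets)

    def render(self, post: str) -> str:
        """Build the zero-shot prompt text for one post (assumed wording)."""
        terms = ", ".join(sorted(self.terms)) or "none"
        targets = ", ".join(sorted(self.targets)) or "none"
        return (
            "Decide whether the post below is hateful. Answer yes or no.\n"
            f"Known derogatory terms: {terms}\n"
            f"Known targets: {targets}\n"
            f"Post: {post}"
        )


# Placeholder values only; a real system would extract these from samples.
prompt = DetectionPrompt()
prompt.update({"<term_a>"}, {"<group_x>"})                # first wave
prompt.update({"<term_b>"}, {"<group_x>", "<group_y>"})   # later wave
text = prompt.render("example post")
```

Keeping the terms and targets in sets means a later wave only adds what is new, so the rendered prompt grows with the vocabulary without any retraining step.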
no code implementations • 22 Dec 2021 • Nishant Vishwamitra, Hongxin Hu, Ziming Zhao, Long Cheng, Feng Luo
We then introduce a new type of multimodal adversarial attack in MUROAN, called the decoupling attack, which aims to compromise multimodal models by decoupling their fused modalities.
no code implementations • ICLR 2018 • Xiang Zhang, Nishant Vishwamitra, Hongxin Hu, Feng Luo
In Crescendo blocks, the numbers of convolution layers and parameters increase only linearly.