Search Results for author: Kazuhiro Takemoto

Found 5 papers, 4 papers with code

All in How You Ask for It: Simple Black-Box Method for Jailbreak Attacks

1 code implementation • 18 Jan 2024 • Kazuhiro Takemoto

Large Language Models (LLMs), such as ChatGPT, encounter "jailbreak" challenges, wherein safeguards are circumvented to generate ethically harmful prompts.

The Moral Machine Experiment on Large Language Models

1 code implementation • 12 Sep 2023 • Kazuhiro Takemoto

As large language models (LLMs) become more deeply integrated into various sectors, understanding how they make moral judgments has become crucial, particularly in the realm of autonomous driving.

Autonomous Driving, Decision Making

Simple black-box universal adversarial attacks on medical image classification based on deep neural networks

no code implementations • 11 Aug 2021 • Kazuki Koga, Kazuhiro Takemoto

In particular, we propose a method for generating UAPs using a simple hill-climbing search based only on DNN outputs and demonstrate the validity of the proposed method using representative DNN-based medical image classifications.

Image Classification, Medical Image Classification
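
The hill-climbing search mentioned in the excerpt above can be illustrated with a toy sketch. This is not the authors' implementation: the names `hill_climb_uap`, `fooling_rate`, and `toy_predict`, the L-infinity bound `eps`, the uniform random proposal, and the use of predicted labels (rather than output scores) as the only feedback are all assumptions made for illustration. The sketch only shows the general idea of improving a single shared perturbation by querying a model's outputs, with no access to gradients.

```python
import random

def fooling_rate(predict, inputs, clean_labels, delta):
    """Fraction of inputs whose predicted label changes once delta is added."""
    fooled = 0
    for x, y in zip(inputs, clean_labels):
        x_adv = [xi + di for xi, di in zip(x, delta)]
        if predict(x_adv) != y:
            fooled += 1
    return fooled / len(inputs)

def hill_climb_uap(predict, inputs, eps=0.5, step=0.1, iters=200, seed=0):
    """Hypothetical black-box search for one perturbation shared by all inputs.

    Only predict() outputs are used; no gradients or model internals.
    """
    rng = random.Random(seed)
    clean_labels = [predict(x) for x in inputs]  # reference predictions
    delta = [0.0] * len(inputs[0])
    best = fooling_rate(predict, inputs, clean_labels, delta)
    for _ in range(iters):
        # Propose a small random change, clipped to the assumed L-infinity ball.
        candidate = [max(-eps, min(eps, d + rng.uniform(-step, step)))
                     for d in delta]
        score = fooling_rate(predict, inputs, clean_labels, candidate)
        if score >= best:  # keep the candidate only if it is no worse
            delta, best = candidate, score
    return delta, best

def toy_predict(x):
    """Stand-in for a DNN classifier: a fixed linear decision rule."""
    return 1 if sum(x) > 0 else 0

toy_inputs = [[0.1, 0.2], [-0.3, 0.1], [0.05, -0.02], [-0.1, -0.1]]
uap, rate = hill_climb_uap(toy_predict, toy_inputs)
```

Accepting ties (`score >= best`) lets the search drift across plateaus where the fooling rate does not change, which matters when label-only feedback is coarse.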

Vulnerability of deep neural networks for detecting COVID-19 cases from chest X-ray images to universal adversarial attacks

1 code implementation • 22 May 2020 • Hokuto Hirano, Kazuki Koga, Kazuhiro Takemoto

As an example, we show that iterative fine-tuning of the DNN models using UAPs improves the robustness of the DNN models against UAPs.

COVID-19 Diagnosis

Simple iterative method for generating targeted universal adversarial perturbations

1 code implementation • 15 Nov 2019 • Hokuto Hirano, Kazuhiro Takemoto

Our method combines the simple iterative method for generating non-targeted UAPs and the fast gradient sign method for generating a targeted adversarial perturbation for an input.

General Classification, Image Classification
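
The combination described in the excerpt above (an iterative universal-perturbation loop whose per-input update is a targeted fast gradient sign step) can be sketched on a toy model. This is an illustrative sketch under stated assumptions, not the paper's code: it swaps the DNN for a two-class logistic model so the input gradient can be written by hand, and the function names, the L-infinity clipping with `eps`, and the fixed `step` are all hypothetical choices.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_prob(w, b, x):
    """Probability of class 1 under a toy logistic model (stand-in for a DNN)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def targeted_fgsm_step(w, b, x, target, step):
    """One targeted FGSM step: move x to decrease the loss toward `target`."""
    p = predict_prob(w, b, x)
    # Gradient of the cross-entropy loss w.r.t. x for a logistic model.
    grad = [(p - target) * wi for wi in w]
    return [-step * sign(g) for g in grad]

def targeted_uap(w, b, inputs, target, eps=0.5, step=0.1, epochs=10):
    """Accumulate one perturbation that pushes all inputs toward `target`."""
    delta = [0.0] * len(w)
    for _ in range(epochs):
        for x in inputs:
            x_adv = [xi + di for xi, di in zip(x, delta)]
            already_target = (predict_prob(w, b, x_adv) > 0.5) == bool(target)
            if not already_target:
                pert = targeted_fgsm_step(w, b, x_adv, target, step)
                # Add the step, then clip to the assumed L-infinity ball.
                delta = [max(-eps, min(eps, d + p))
                         for d, p in zip(delta, pert)]
    return delta

toy_w, toy_b = [1.0, -2.0], 0.0
toy_inputs = [[-0.2, 0.1], [0.1, 0.3], [0.3, 0.0]]
uap = targeted_uap(toy_w, toy_b, toy_inputs, target=1)
```

The perturbation is updated only for inputs not yet classified as the target, so the loop stops changing `delta` once every input in the toy set is pushed into the target class or the bound `eps` is reached.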
