Membership inference attack with relative decision boundary distance

7 Jun 2023  ·  Jiacheng Xu, Chengxiang Tan ·

Membership inference attack is one of the most popular privacy attacks in machine learning, which aims to predict whether a given sample was contained in the target model's training set. Label-only membership inference attack is a variant that exploits sample robustness and attracts more attention since it assumes a practical scenario in which the adversary only has access to the predicted labels of the input samples. However, since the decision boundary distance, which measures robustness, is strongly affected by the random initial image, the adversary may get opposite results even for the same input samples. In this paper, we propose a new attack method, called muti-class adaptive membership inference attack in the label-only setting. All decision boundary distances for all target classes have been traversed in the early attack iterations, and the subsequent attack iterations continue with the shortest decision boundary distance to obtain a stable and optimal decision boundary distance. Instead of using a single boundary distance, the relative boundary distance between samples and neighboring points has also been employed as a new membership score to distinguish between member samples inside the training set and nonmember samples outside the training set. Experiments show that previous label-only membership inference attacks using the untargeted HopSkipJump algorithm fail to achieve optimal decision bounds in more than half of the samples, whereas our multi-targeted HopSkipJump algorithm succeeds in almost all samples. In addition, extensive experiments show that our multi-class adaptive MIA outperforms current label-only membership inference attacks in the CIFAR10, and CIFAR100 datasets, especially for the true positive rate at low false positive rates metric.

PDF Abstract

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods