
Adversarial Attack Methods Based On Classification Models

Posted on: 2023-02-28    Degree: Master    Type: Thesis
Country: China    Candidate: B H Zeng    Full Text: PDF
GTID: 2558307079959389    Subject: Computer Science and Technology
Abstract/Summary:
Deep neural networks (DNNs) have achieved remarkable results in image classification in recent years. Nevertheless, research in adversarial machine learning reveals their unreliability: by adding carefully crafted perturbations (i.e., adversarial attacks) to normal images, one can generate adversarial examples that easily deceive state-of-the-art DNNs, which raises doubts about the stability of deployed models. In addition, the Vision Transformer (ViT), trained on large-scale datasets, achieves high performance on image classification tasks. Many studies have shown that ViTs are more robust than DNNs, but ViTs also carry security risks, and their own characteristics may be exploited for attacks. Exploring the vulnerability of both DNNs and ViTs is therefore important.

The black-box setting is the most challenging and practical one for adversarial attacks, because the attacker usually has no access to information about the victim model. In this case, the victim model is typically attacked by exploiting the transferability of adversarial examples: a substitute (white-box) model is used to generate adversarial examples, which are then applied to the victim model. Because the substitute model and the victim model differ, the transferability of the adversarial examples to the victim model may be very low. Model enhancement methods are commonly used to improve transferability by generating adversarial examples with loss-preserving input transformations. This thesis improves the transferability of adversarial examples through two model enhancement methods:

1. Boundary Fitting Fast Gradient Sign Method (BF-FGSM). This thesis focuses on the decision boundary and proposes the decision-boundary distance as a measure of attack performance and model robustness. Based on the observation that different models have more similar gradients at decision boundary points, BF-FGSM, in each iteration, samples decision boundary points in batches and aggregates their gradients to fit the decision boundaries of different models, generating more transferable adversarial examples.

2. Perturbation-invariant Fast Gradient Sign Method (PI-FGSM). This thesis explores the characteristics of ViTs, focusing on their sensitivity to spatial structure. It finds that ViTs are relatively insensitive to spatial structure changes and can still correctly classify images after the images are randomly shuffled. Based on this perturbation-invariant characteristic of ViTs, PI-FGSM, in each iteration, randomly shuffles the original image and uses this loss-preserving transformation for model enhancement to improve the transferability of adversarial examples.

Extensive experiments demonstrate the effectiveness of the proposed methods. I hope these methods can serve as baselines, help generate more transferable adversarial examples, and support the evaluation of the robustness of various models.
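The sketch below illustrates the boundary-fitting idea described in point 1, assuming a PyTorch image classifier. The bisection-based boundary sampling, the jitter scale, and all hyper-parameter values are my own illustrative assumptions, not the thesis's exact formulation.

```python
import torch
import torch.nn.functional as F

def near_boundary_point(model, x, y, direction, n_bisect=8):
    """Bisect along `direction` to find a point close to the substitute model's
    decision boundary for each image in the batch (illustrative sampling scheme;
    the thesis's procedure may differ)."""
    with torch.no_grad():
        lo = torch.zeros(x.size(0), 1, 1, 1, device=x.device)
        hi = torch.ones_like(lo)
        for _ in range(n_bisect):
            mid = (lo + hi) / 2
            correct = (model(x + mid * direction).argmax(dim=1) == y)
            mask = correct.float().view(-1, 1, 1, 1)
            lo = mask * mid + (1 - mask) * lo   # still correct: boundary is farther out
            hi = mask * hi + (1 - mask) * mid   # already wrong: boundary is closer in
        return x + hi * direction               # just past the decision boundary

def bf_fgsm_like_attack(model, x, y, eps=8 / 255, steps=10, n_points=5):
    """Iterative sign-gradient attack that aggregates gradients from a small batch
    of points sampled near the decision boundary (hypothetical hyper-parameters)."""
    alpha = eps / steps
    x_adv = x.clone().detach()
    for _ in range(steps):
        # a crude outward direction: the sign gradient of the loss at the current point
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        direction = eps * torch.autograd.grad(loss, x_adv)[0].sign()
        x_adv = x_adv.detach()

        boundary = near_boundary_point(model, x_adv, y, direction)
        grad = torch.zeros_like(x_adv)
        for _ in range(n_points):
            # jitter the boundary point to obtain a batch of nearby samples
            pt = (boundary + 0.01 * torch.randn_like(boundary)).requires_grad_(True)
            grad += torch.autograd.grad(F.cross_entropy(model(pt), y), pt)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv
```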
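The next sketch illustrates the perturbation-invariant idea described in point 2, again assuming a PyTorch classifier. The patch-shuffle transform, the grid size, and the number of shuffled copies are illustrative assumptions rather than the thesis's exact algorithm; the image height and width are assumed to be divisible by the grid size.

```python
import torch
import torch.nn.functional as F

def patch_shuffle(x, grid=4):
    """Randomly permute non-overlapping patches of each image: an illustrative
    loss-preserving transformation exploiting ViTs' relative insensitivity to
    spatial reordering (grid size is an assumption)."""
    b, c, h, w = x.shape
    ph, pw = h // grid, w // grid
    patches = x.unfold(2, ph, ph).unfold(3, pw, pw)            # B, C, g, g, ph, pw
    patches = patches.contiguous().view(b, c, grid * grid, ph, pw)
    perm = torch.randperm(grid * grid, device=x.device)
    patches = patches[:, :, perm].view(b, c, grid, grid, ph, pw)
    patches = patches.permute(0, 1, 2, 4, 3, 5).contiguous()   # B, C, g, ph, g, pw
    return patches.view(b, c, h, w)

def pi_fgsm_like_attack(model, x, y, eps=8 / 255, steps=10, n_copies=5):
    """Iterative sign-gradient attack that aggregates gradients over several
    randomly shuffled copies of the input, i.e., model enhancement with a
    loss-preserving transformation (hypothetical hyper-parameters)."""
    alpha = eps / steps
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.zeros_like(x_adv)
        for _ in range(n_copies):
            # each copy is shuffled differently; gradients still flow back to x_adv
            loss = F.cross_entropy(model(patch_shuffle(x_adv)), y)
            grad += torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach()
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
    return x_adv
```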
Keywords/Search Tags:Adversarial Example, Boundary Fitting Fast Gradient Sign Method, Perturbation-invariant Fast Gradient Sign Method