
Research On Image Classification Adversarial Example Method Based On Black Box Attack

Posted on: 2024-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y R Liu
Full Text: PDF
GTID: 2558307181954029
Subject: Computer application technology
Abstract/Summary:
With the development of artificial intelligence, deep learning has made remarkable achievements in computer vision. However, studies have shown that adversarial examples, crafted by adding tiny perturbations to benign samples, can severely mislead the predictions of deep learning models. An adversarial example with good transferability can affect the predictions of models with different architectures. Adversarial examples therefore pose a serious challenge to the wide deployment of deep learning models in essential fields such as industry, medical care, and national defense; at the same time, they provide criteria for evaluating the robustness of deep learning models and for detecting their algorithmic defects. Proposing adversarial attack methods and crafting adversarial examples has thus become a hot topic at the intersection of artificial intelligence and network security.

This thesis focuses on image classification, a core upstream task of computer vision, under the black-box attack setting, which is both the most difficult setting to attack and the closest to real application scenarios. Based on an in-depth survey of existing adversarial attack methods, two new data-free universal adversarial attack methods are proposed to address the poor transferability and limited attack effectiveness of existing methods, improving both the transferability and the attack strength of adversarial examples crafted with universal adversarial perturbations. The contributions are as follows:

(1) A data-free universal attack method based on weighted activation maximization is proposed. Existing data-free universal adversarial attacks simply maximize the activation values output by all convolutional layers of the model, without considering the differences among the features extracted by different convolutional layers, which leads to poor transferability of the resulting adversarial examples. This thesis proposes a weighted activation-maximization objective for training the universal adversarial perturbation: each convolutional layer of the model is assigned a corresponding weight that controls that layer's contribution to the training of the perturbation, so that the perturbation can learn the more generalizable features output by the shallow convolutional layers, which improves the transferability of the adversarial examples. Ablation experiments demonstrate the effectiveness of weighted activation maximization, and comparison experiments show that, compared with classic data-free universal attacks, adversarial examples crafted with this method achieve better transferability and attack performance.

(2) A data-free universal adversarial attack method based on truncated-ratio activation maximization is proposed. Data-free universal attacks use the activation values output by the model's convolutional layers as prior knowledge to craft universal adversarial perturbations. However, the truncated output of a convolutional layer retains only the positive values, while the negative values in the original pre-activation output are discarded, so part of the convolutional layer's information is lost. Combined with the absence of real samples, attacks that only maximize the positive activations of the convolutional layers perform poorly. To further improve the attack effect of the adversarial examples, this thesis builds on the weighted activation-maximization method and proposes a truncated-ratio activation-maximization formulation that combines the positive and negative activations of each convolutional layer: by maximizing the positive activations while minimizing the negative ones, the method further increases the layers' output activations and thereby better disrupts the model's feature extraction. Verification experiments demonstrate the effectiveness of the negative-activation term, and comparison experiments show that the truncated-ratio maximization method achieves better attack performance than the other data-free universal adversarial attack methods. In addition, this thesis analyzes the properties of universal adversarial perturbations and of the resulting adversarial examples, concluding that the local features extracted by the shallow convolutional layers of a convolutional neural network are conducive to training universal adversarial perturbations. This provides a research basis for improving subsequent adversarial attacks and for exploring the properties of adversarial examples.
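To make the two objectives concrete, the following is a minimal NumPy sketch on a toy two-layer network with random weights. Everything here is an illustrative assumption rather than the thesis's actual implementation: a real attack would use the convolutional layers of a pretrained CNN with backpropagation instead of the finite-difference loop, and the layer weights `weights` and trade-off coefficient `lam` are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a pretrained network: two random "conv" layers,
# flattened to matrix multiplies for simplicity (hypothetical weights).
W1 = rng.standard_normal((16, 8)) * 0.5
W2 = rng.standard_normal((16, 16)) * 0.5

def layer_preacts(delta):
    """Pre-activation output of each layer when the perturbation alone
    is fed to the network (the data-free setting: no real images)."""
    z1 = W1 @ delta
    z2 = W2 @ np.maximum(z1, 0.0)   # ReLU between layers
    return [z1, z2]

def weighted_activation_loss(delta, weights):
    # Objective (1): weighted sum of the positive (ReLU-retained)
    # activation energy per layer; larger weights emphasize shallow layers.
    return sum(w * np.sum(np.maximum(z, 0.0) ** 2)
               for w, z in zip(weights, layer_preacts(delta)))

def truncated_ratio_loss(delta, weights, lam=0.5):
    # Objective (2): additionally penalize (i.e. minimize) the negative
    # pre-activation part that ReLU would discard; lam is a hypothetical
    # trade-off coefficient combining the two terms.
    total = 0.0
    for w, z in zip(weights, layer_preacts(delta)):
        pos = np.sum(np.maximum(z, 0.0) ** 2)
        neg = np.sum(np.minimum(z, 0.0) ** 2)
        total += w * (pos - lam * neg)
    return total

def train_uap(loss_fn, weights, eps=0.1, steps=200, lr=0.01):
    """Gradient ascent on the perturbation via central finite differences,
    projecting back onto the L-inf ball of radius eps after each step."""
    delta = rng.uniform(-eps, eps, size=8)
    h = 1e-5
    for _ in range(steps):
        grad = np.zeros_like(delta)
        for i in range(delta.size):
            e = np.zeros_like(delta)
            e[i] = h
            grad[i] = (loss_fn(delta + e, weights)
                       - loss_fn(delta - e, weights)) / (2 * h)
        step = lr * grad / (np.linalg.norm(grad) + 1e-12)
        delta = np.clip(delta + step, -eps, eps)
    return delta

weights = [1.0, 0.25]  # assumed weighting: shallow layer dominates
d1 = train_uap(weighted_activation_loss, weights)
d2 = train_uap(truncated_ratio_loss, weights)
print(weighted_activation_loss(d1, weights), truncated_ratio_loss(d2, weights))
```

In this sketch the perturbation itself is the only input to the network, which is what "data-free" means here; the shallow-layer weight being larger mirrors the thesis's finding that shallow-layer local features are the most useful for training the universal perturbation.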
Keywords/Search Tags:Computer vision, image classification, black-box attack, universal attack, data-free fashion