In recent years, with the rapid development of deep learning, artificial intelligence has made significant progress in image classification. Neural network models designed and trained with deep learning techniques can achieve higher classification accuracy than the human eye, and are therefore widely used in areas such as image recognition and face recognition. Although neural networks bring great convenience to people's lives, they also face security issues. Research has shown that neural networks are vulnerable to adversarial examples, which creates risks when they are applied in systems with high security requirements: face recognition and autonomous driving systems, for example, may fail under adversarial attack. Adversarial examples have become a challenge that must be recognized and overcome before deep learning models can be deployed, so research on adversarial examples is important.

Depending on the information available to the attacker, adversarial attacks can be divided into white-box and black-box attacks. Black-box attacks, in which the attacker has no access to information about the model, are the more realistic scenario. The mainstream line of research in black-box attacks is the transfer-based attack: the attacker, having no access to the target model, generates an adversarial example on a white-box surrogate model and then uses the generated example to attack the black-box target model directly. The mainstream attack methods are gradient-iteration-based; they perform well on white-box models but are not sufficiently transferable when attacking black-box models. Researchers have proposed a number of methods to improve the transferability of adversarial examples, but existing transfer-based attacks are still insufficient against black-box models, especially defense models. Therefore, this dissertation investigates how to improve the transferability of adversarial examples in transfer-based attack scenarios. The main contributions of the dissertation are summarized as follows:

1. Analyzing adversarial transferability from a theoretical perspective. Regarding adversarial example generation as the dual of the model-training optimization process, we apply theory from the field of model generalization to analyze adversarial transferability. Using the PAC-Bayes framework, we find that adversarial transferability is mainly related to three factors: the optimization algorithm, model diversity, and loss flatness. Improving adversarial transferability can therefore start from these three perspectives: improving the optimization algorithm, applying model augmentation, and flattening the loss landscape. The first two explain well the current attack methods based on optimization algorithms and input diversity.

2. Improving adversarial transferability based on input diversity. Applying image transformations to increase input diversity is a mainstream way to improve the transferability of adversarial examples; it is an indirect form of model augmentation. However, existing image transformations have many limitations in improving input diversity. To address these shortcomings, this dissertation proposes a concise and effective image transformation and optimizes the multi-branch transformation structure, resulting in an attack method with greater input diversity and greater transferability. Ablation experiments demonstrate the effectiveness of each part of the proposed method, and comparison experiments show that the proposed method improves adversarial transferability effectively; a sketch of the underlying attack family is given below.
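The dissertation's own transformation is not reproduced in this abstract. Purely as a minimal sketch of the input-diversity family it builds on, the PyTorch code below applies the classic random resize-and-pad transform (DI-FGSM, Xie et al., 2019) inside a momentum-iterative attack on a surrogate model; all function names and hyperparameters are illustrative assumptions, not the proposed method.

```python
import torch
import torch.nn.functional as F

def diverse_input(x, p=0.7):
    # With probability p, randomly shrink the batch and zero-pad it back
    # to its original resolution (the random resize-and-pad transform of
    # DI-FGSM), so each attack iteration sees a slightly different view.
    if torch.rand(1).item() > p:
        return x
    h = x.shape[-1]
    size = torch.randint(int(0.9 * h), h, (1,)).item()
    resized = F.interpolate(x, size=(size, size), mode="nearest")
    pad = h - size
    top = torch.randint(0, pad + 1, (1,)).item()
    left = torch.randint(0, pad + 1, (1,)).item()
    return F.pad(resized, (left, pad - left, top, pad - top))

def transfer_attack(model, x, y, eps=16 / 255, steps=10, mu=1.0):
    # Momentum-iterative FGSM on a white-box surrogate; the transform is
    # applied to the input before every gradient computation.
    alpha = eps / steps
    g = torch.zeros_like(x)
    x_adv = x.clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(diverse_input(x_adv)), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Accumulate momentum over L1-normalized gradients (MI-FGSM).
        g = mu * g + grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True)
        x_adv = x_adv.detach() + alpha * g.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into the eps-ball
        x_adv = x_adv.clamp(0, 1)                 # keep a valid image
    return x_adv
```

Because the transform only changes what the surrogate sees at each step, the outer optimization loop is untouched; this is why input diversity acts as an indirect, per-iteration form of model augmentation.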
3. Improving adversarial transferability based on input loss flatness. According to the theoretical analysis above, loss flatness affects adversarial transferability, so this dissertation proposes an attack method that makes the input loss flatter. Specifically, we first verify experimentally the correlation between input loss flatness and adversarial transferability, and then propose a simple optimization objective that flattens the loss landscape around the generated examples, thereby enhancing their transferability. Extensive experiments are conducted on various neural network models, and the results show that the proposed method effectively flattens the input loss and improves adversarial transferability. Moreover, it can be combined with optimization-based and input-diversity-based methods to further improve adversarial transferability.
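The abstract does not state the proposed objective. As an illustrative sketch only (not the dissertation's method), one common way to favor flat maxima of the input loss is to ascend the gradient averaged over a small random neighborhood of the current example; the names, sampling radius, and sample count below are assumptions.

```python
import torch
import torch.nn.functional as F

def neighborhood_grad(model, x_adv, y, radius=4 / 255, n_samples=4):
    # Average the input gradient over random points in a small ball
    # around x_adv. Ascending this averaged gradient steers the attack
    # toward maxima whose surrounding loss surface is flat, so small
    # model-to-model shifts of the surface degrade the example less.
    grad = torch.zeros_like(x_adv)
    for _ in range(n_samples):
        noise = torch.empty_like(x_adv).uniform_(-radius, radius)
        x_n = (x_adv + noise).clamp(0, 1).detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_n), y)
        grad = grad + torch.autograd.grad(loss, x_n)[0]
    return grad / n_samples
```

In an iterative loop like the one sketched earlier, this estimate would replace the single-point gradient. Because it only changes how the gradient is computed, it composes directly with momentum-based optimization and input-diversity transforms, which is consistent with the combinability noted above.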