| Deep neural networks (DNNs) have achieved great success in computer vision, speech recognition, natural language processing, and other fields. However, because of their opacity, users who receive a decision often cannot understand the prediction process, what effective features the model has learned, or how the prediction and judgment are made; the whole process lacks explanation. Interpretability methods address this by augmenting the model's output with an "explanation" of its decisions. These interpretation regions, however, also hand an advantage to attackers. In this thesis, we find that interpretability methods naturally provide specific regions for the generation of adversarial samples. Moreover, some interpretability methods are built on the Saliency Map (SM), which assigns a value to each pixel in the region quantifying its influence on the prediction of the neural network model. This suggests a new way to generate adversarial samples: use the interpretation region as the candidate region that constrains the perturbation, and further refine the position of each perturbed pixel with the saliency map. Based on the above observations, this thesis conducts the following research on the security of interpretability in deep learning:

(1) This thesis verifies the feasibility of using the saliency map to mount adversarial attacks in the white-box setting. Considering the diversity of interpretability methods, it further proposes a dynamic genetic algorithm to generate adversarial samples in the black-box setting. "Dynamic" emphasizes the changing trade-off between the number of perturbed pixels and the magnitude of the perturbation: the optimal set of perturbed pixels is found by gradual approximation, and the perturbation is added over multiple rounds of the genetic algorithm. Compared with the traditional genetic algorithm, the fitness function is improved to guide the generation of adversarial samples through the change of the interpretation region. Experimental results show that this method deceives different neural network models with an average success rate of 92.88% under controllable time complexity.

(2) This thesis then enhances the robustness of interpretability methods against the above attacks on the image interpretation region. First, based on the fact that reducing the principal curvature of the network model smooths the saliency map, the activation function ReLU is replaced by Softplus, which effectively reduces the variation of the interpretation region. Second, the perturbation set crafted by the adversary is disrupted by adding Gaussian noise to the adversarial sample several times in advance and averaging the resulting gradients (a mean-gradient idea). The experiments discuss in detail the influence of the hyperparameter β, the standard deviation of the noise, and the number of samples on the interpretation region. The results show that, when facing adversarial samples, the SSIM value of the improved interpretation region increases by 84% and the MSE value decreases by 65%, which significantly enhances the robustness of the interpretation region.
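
As an illustration of the white-box idea in (1), the sketch below restricts a one-step gradient-sign perturbation to the most salient pixels of the input. This is a minimal sketch, not the thesis's exact algorithm; the model handle, the top-k size `k`, and the step size `eps` are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def saliency_map(model, x, label):
    """Gradient-based saliency map: per-pixel |d(score of `label`) / d(x)|."""
    x = x.clone().detach().requires_grad_(True)
    model(x)[0, label].backward()
    return x.grad.abs().sum(dim=1, keepdim=True)    # collapse colour channels -> (1, 1, H, W)

def perturb_salient_pixels(model, x, label, k=100, eps=0.05):
    """Restrict a sign-of-gradient perturbation to the k most salient pixels (illustrative values)."""
    sal = saliency_map(model, x, label)
    mask = torch.zeros_like(sal).flatten()
    mask[sal.flatten().topk(k).indices] = 1.0       # candidate region taken from the saliency map
    mask = mask.view_as(sal)

    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), torch.tensor([label]))
    loss.backward()
    step = eps * x_adv.grad.sign() * mask           # perturb only inside the candidate region
    return (x + step).clamp(0, 1).detach()
```

Because the mask is derived from the saliency map, only pixels that the interpretation method itself marks as influential are modified, which is the candidate-region constraint described above.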
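The black-box dynamic genetic algorithm is not fully specified in this abstract, but its key departure from a traditional genetic algorithm is a fitness function that also rewards change in the interpretation region. The sketch below shows one plausible form of such a fitness; the weights `alpha` and `beta` and the particular terms combined are illustrative assumptions, not the thesis's formula.

```python
import numpy as np

def fitness(predict, interp_mask, x_adv, x_clean, true_label, alpha=1.0, beta=0.5):
    """Toy fitness for a black-box, interpretation-guided genetic attack.

    predict(x)     -> class-probability vector (query access only)
    interp_mask(x) -> binary interpretation-region mask for input x
    """
    attack_term = -predict(x_adv)[true_label]                           # reward lowering the true-class confidence
    region_shift = np.mean(interp_mask(x_adv) != interp_mask(x_clean))  # reward shifting the interpretation region
    size_penalty = np.mean(np.abs(x_adv - x_clean))                     # keep the perturbation small
    return alpha * attack_term + beta * region_shift - size_penalty
```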
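For the defense in (2), the two measures can be sketched as a recursive ReLU-to-Softplus replacement (lowering the model's curvature and thus smoothing the saliency map) and a mean-gradient saliency computed over several Gaussian-noised copies of the input. This is again a minimal sketch under assumed defaults; the Softplus β, the noise standard deviation σ, and the number of noisy samples are the hyperparameters whose influence the experiments examine.

```python
import torch

def replace_relu_with_softplus(module, beta=10.0):
    """Recursively swap every ReLU for Softplus(beta) to reduce the model's curvature."""
    for name, child in module.named_children():
        if isinstance(child, torch.nn.ReLU):
            setattr(module, name, torch.nn.Softplus(beta=beta))
        else:
            replace_relu_with_softplus(child, beta)

def mean_gradient_saliency(model, x, label, sigma=0.1, n_samples=20):
    """Average the input gradient over several Gaussian-noised copies of x,
    so a pixel-precise adversarial perturbation no longer dominates the map."""
    grads = torch.zeros_like(x)
    for _ in range(n_samples):
        noisy = (x + sigma * torch.randn_like(x)).detach().requires_grad_(True)
        model(noisy)[0, label].backward()
        grads += noisy.grad
    return (grads / n_samples).abs().sum(dim=1, keepdim=True)
```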