
Research On Enhancement Of Image Adversarial Samples Transferability In Deep Learning Based On Semantic Features

Posted on: 2024-06-12    Degree: Master    Type: Thesis
Country: China    Candidate: Y W Tang    Full Text: PDF
GTID: 2568307079455374    Subject: Information and Communication Engineering
Abstract/Summary:
In recent years, artificial intelligence technology has developed rapidly, and image recognition has been widely deployed in applications such as Alipay facial-recognition payment, autonomous driving, and plant identification from photos. However, it has been found that adding a slight perturbation imperceptible to the human eye to an image can mislead a deep neural network and cause it to misclassify; this process is called an adversarial attack. Many fields in which deep learning is applied have extremely high security requirements, so the existence of adversarial examples is a serious concern. It is therefore necessary to uncover as many blind spots of deep neural networks as possible and to eliminate the danger posed by adversarial examples at its source.

Since most information about the target model is unknown in practice, black-box attacks are more realistic than white-box attacks. Exploiting the transferability of adversarial examples is one black-box attack strategy: an adversarial example is crafted against a known source model and then transferred to attack the target model. This thesis studies the problem from the perspective of image features, aiming to strengthen the transferability of adversarial examples by perturbing the essential features of the image. The main work of this thesis is as follows:

(1) A transferable attack method based on the essential features shared by images of the same category is proposed to improve the transferability of adversarial examples. Grounded in information theory, the method uses an objective function to constrain the information transfer between successive layers of the deep neural network, so as to disentangle the network with respect to a specific image category and extract a sub-architecture closely related to that category. Compared with the source model, feature extraction with this sub-architecture suppresses features of non-target categories and highlights the essential features of the target category. Perturbing these essential features makes the perturbation depend mainly on the image itself rather than on differences between models, which improves the transferability of adversarial examples.

(2) A transferable attack method based on feature attribution is proposed, which further subdivides image features, extracts the essential features of the image as far as possible, and suppresses "noise" features to improve transferability. The method introduces aggregated gradients: the gradients of the source model with respect to intermediate-layer feature maps are averaged over a set of randomly transformed copies of the image. This random transformation suppresses model-specific features while preserving the essential features of the image. With the perturbation of non-essential image features already reduced, the perturbation of model-specific features is reduced as well, so that the transferability of adversarial examples is further strengthened.
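To illustrate the aggregated-gradient idea in (2), the following is a minimal PyTorch sketch of a feature-level transfer attack. It is an illustrative reconstruction under stated assumptions, not the thesis's exact algorithm: the use of random pixel masking as the random transformation, attribution via the true-class logit, the particular intermediate layer, and all hyperparameters (number of copies, masking probability, step budget, momentum) are assumptions made for the sketch.

import torch

def aggregated_gradients(model, layer, x, y, n_copies=30, drop_prob=0.3):
    """Average the gradient of the true-class logit w.r.t. an intermediate
    feature map over randomly masked copies of x (illustrative assumption)."""
    feats = {}
    handle = layer.register_forward_hook(lambda m, i, o: feats.update(f=o))
    agg = None
    for _ in range(n_copies):
        mask = (torch.rand_like(x) > drop_prob).float()    # random pixel dropping
        logits = model((x * mask).requires_grad_(True))
        score = logits.gather(1, y.view(-1, 1)).sum()      # true-class logit
        g = torch.autograd.grad(score, feats['f'])[0]
        agg = g if agg is None else agg + g
    handle.remove()
    agg = agg / n_copies
    return agg / (agg.flatten(1).norm(dim=1).view(-1, 1, 1, 1) + 1e-12)

def feature_transfer_attack(model, layer, x, y, eps=16 / 255, steps=10, mu=1.0):
    """Iteratively suppress the features weighted as important by the
    aggregated gradients, within an L_inf ball of radius eps.
    model: an eval-mode classifier taking inputs in [0, 1] (assumption)."""
    weights = aggregated_gradients(model, layer, x, y)
    feats = {}
    handle = layer.register_forward_hook(lambda m, i, o: feats.update(f=o))
    adv, momentum, alpha = x.clone().detach(), torch.zeros_like(x), eps / steps
    for _ in range(steps):
        adv.requires_grad_(True)
        model(adv)                                         # populates feats['f']
        loss = (weights * feats['f']).sum()                # feature-importance loss
        grad = torch.autograd.grad(loss, adv)[0]
        momentum = mu * momentum + grad / (grad.abs().mean() + 1e-12)
        adv = adv.detach() - alpha * momentum.sign()       # descend: suppress key features
        adv = torch.clamp(x + torch.clamp(adv - x, -eps, eps), 0.0, 1.0)
    handle.remove()
    return adv.detach()

Minimizing the weighted feature sum suppresses activations that the aggregated gradients associate with the true class, which is intended to make the perturbation image-dependent rather than model-dependent and hence more transferable.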
Keywords/Search Tags:convolutional neural networks, adversarial attacks, adversarial examples, black-box transfer attacks