
Adversarial Example Detection Methods For Deep Learning Image Recognition Models

Posted on: 2023-08-28
Degree: Master
Type: Thesis
Country: China
Candidate: S X Chen
GTID: 2568307070483824
Subject: Engineering
Abstract/Summary:
In recent years, Deep Neural Networks (DNNs) have achieved great success in computer vision, speech recognition, and natural language processing due to their powerful adaptive learning capabilities. However, recent studies have shown that DNNs are vulnerable to adversarial examples (AEs), which are generated by adding human-imperceptible perturbations to normal examples. Such examples cause DNNs to make wrong decisions with high probability, so their existence poses serious security risks to applications based on deep learning. Research on AE detection is therefore of great significance for artificial intelligence security.

To reduce the potential harm caused by AEs, researchers have proposed a series of AE detection methods. However, existing detection methods have shortcomings: they fail to balance performance and efficiency, and they lack the flexibility to generalize from one attack to another. Based on a comprehensive evaluation of the performance and efficiency of existing detection methods, this thesis proposes an AE detection method based on random sampling and an AE detection method based on the discrepancy of class label sequences. The specific research contents are as follows:

(1) We conduct a comprehensive study of the performance and efficiency of five mainstream AE detection methods (SPBAS, ML-LOO, KD+BU, LID, and MAHA) against AEs generated by five common adversarial attacks (FGSM, C&W, BIM, JSMA, and DeepFool) on four benchmark datasets (MNIST, CIFAR-10, SVHN, and CIFAR-100). The results show that the performance of detection methods depends considerably on the properties of the datasets and the characteristics of the adversarial attacks, and that existing detection methods fail to achieve a satisfactory trade-off between performance and efficiency.

(2) We propose an efficient AE detection method based on random sampling, aimed at the inefficiency of feature attribution-based detection methods. First, the K-S test is used to show that even if only part of the pixels in an image are processed, there is still a significant difference in the feature attribution distribution between the normal image and the adversarial image, so detection performance is not affected. Then, various sampling strategies are used in extensive experiments against commonly used adversarial attacks on four datasets, verifying that the accelerated method remarkably improves detection efficiency while maintaining performance comparable to existing mainstream methods. For example, on the CIFAR-100 dataset the proposed method improves the average detection efficiency by a factor of 5.56.

(3) We propose an AE detection method based on the discrepancy of class label sequences, aimed at the problem that existing detection methods lack the flexibility to generalize from one adversarial attack to another. The critical insight is that the class label sequence discrepancy of adversarial examples is much higher than that of normal examples. First, an input example is transformed by masking pixels, and the example and its transformed versions are fed into the DNN. A set of random forest models fitted on the activation layers of the DNN is then used to derive a class label sequence for each of these examples, and the discrepancy between each transformed example's label sequence and the original example's label sequence is calculated. Finally, a logistic regression classifier is trained on the Label Sequence Discrepancy (LSD) for AE detection. Compared with mainstream detection methods, LSD achieves state-of-the-art performance in detecting AEs. In particular, LSD can effectively detect AEs with different confidence levels and generalizes well across different attacks.
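The label-sequence idea in item (3) can be illustrated with a minimal sketch. The abstract does not say how the discrepancy is measured or how pixels are masked, so the normalized Hamming distance, the zero-valued mask, and the names `mask_pixels`, `sequence_discrepancy`, and `lsd_features` are all assumptions made for illustration. The label sequences below are made-up toy values standing in for the per-layer random forest predictions, not results from the thesis.

```python
import random


def mask_pixels(image, fraction, rng):
    """Return a copy of a flat pixel list with a random `fraction` of pixels set to 0.

    Assumption: masking means zeroing pixels; the thesis may mask differently.
    """
    masked = list(image)
    n_mask = int(len(image) * fraction)
    for i in rng.sample(range(len(image)), n_mask):
        masked[i] = 0.0
    return masked


def sequence_discrepancy(original_seq, transformed_seq):
    """Normalized Hamming distance between two equal-length label sequences.

    Assumption: this is one plausible discrepancy measure, not necessarily
    the one used in the thesis.
    """
    if len(original_seq) != len(transformed_seq):
        raise ValueError("label sequences must have the same length")
    differing = sum(a != b for a, b in zip(original_seq, transformed_seq))
    return differing / len(original_seq)


def lsd_features(original_seq, transformed_seqs):
    """One discrepancy value per transformed example; these would be the
    inputs to the final logistic regression detector."""
    return [sequence_discrepancy(original_seq, s) for s in transformed_seqs]


# Toy label sequences (one label per activation layer), hypothetical values.
normal_original = [3, 3, 3, 3]
normal_transformed = [[3, 3, 3, 3], [3, 3, 3, 3], [3, 5, 3, 3]]

adversarial_original = [3, 3, 8, 8]
adversarial_transformed = [[3, 3, 3, 3], [3, 3, 8, 3], [3, 3, 3, 8]]

print(lsd_features(normal_original, normal_transformed))            # [0.0, 0.0, 0.25]
print(lsd_features(adversarial_original, adversarial_transformed))  # [0.5, 0.25, 0.25]
```

In this toy case the adversarial example's labels change more often under masking than the normal example's, which is the kind of gap a downstream classifier could exploit.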
Keywords/Search Tags: Deep Neural Network, Adversarial Examples, Random Sampling, Label Sequence Discrepancy