| Adversarial examples are formed by adding subtle perturbations, imperceptible to the naked eye, to examples from the original dataset. Such examples cause a trained model to give wrong classifications with high confidence. The small size of adversarial perturbations, the transferability of adversarial examples, and the practical feasibility of adversarial attacks make adversarial examples a serious security threat. Some adversarial examples even reach an attack success rate of 100%, leaving the target model completely unable to classify accurately and exposing the vulnerability of neural networks. Because neural network models are widely used across industries, it is of great practical significance to propose ways to enhance their robustness so that they remain available, accurate, and safe. In deep learning image classification, feature squeezing not only significantly enhances the robustness of the model but also preserves accuracy on legitimate inputs, thereby providing an accurate detector for static adversarial examples. In addition, compared with the many detection methods that modify the structure of the original classifier, this approach avoids drawbacks such as lower classification performance on clean examples or higher consumption of computational resources. However, it still suffers from a high false positive rate. To address this, this paper studies the threshold setting of the feature squeezing method from a statistical perspective. The analysis shows that the changes in Manhattan distance and Gini impurity between squeezed and unsqueezed inputs differ for adversarial examples and legitimate examples. Based on this analysis, this paper defines the squeezing score as the weighted sum of the absolute values of the Manhattan distance and the Gini impurity difference. The squeezing score serves as an indicator of how close an unknown example is to an adversarial example, and
its maximum and average values serve as two detection criteria for comparison with a threshold. By comparing how different statistical indicators and the methods built on them affect feature squeezing adversarial detection, a joint detection method of feature squeezers based on the squeezing score (SSMF for short) is proposed. The method first uses a feature squeezer to remove unnecessary features, then uses the squeezing score in the image classifier as a detection indicator, and finally compares the maximum and average values of the score with a threshold to determine whether the input is an adversarial example. Experiments are conducted on the MNIST and CIFAR-10 datasets. The results show that, at comparable accuracy, the method based on the maximum squeezing score has a false positive rate 0.5% and 1.55% lower, respectively, than the traditional feature squeezing method, which uses only the maximum Manhattan distance as a single detection criterion; the method based on the average squeezing score has a false positive rate 0.5% and 0.2% lower, respectively. This demonstrates that the detection scheme constructed in this work effectively reduces the false positive rate of traditional feature squeezing detection methods, helps improve the robustness of related models, and has reference value and application prospects for the detection of adversarial examples. The results obtained in this paper are only preliminary, and there is still room for improvement in algorithm framework design and experimental settings, such as introducing randomness and enriching the datasets. |
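
The squeezing score and the two detection criteria described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the weights, the threshold, or the squeezers used, so `w_dist`, `w_gini`, the threshold of 0.5, and all function names below are hypothetical. The sketch also assumes that both the Manhattan distance and the Gini impurity are computed on the classifier's softmax prediction vectors for the original and squeezed inputs.

```python
from statistics import mean


def gini_impurity(probs):
    """Gini impurity 1 - sum(p_i^2) of a probability vector."""
    return 1.0 - sum(p * p for p in probs)


def manhattan_distance(p, q):
    """L1 distance between two prediction vectors of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))


def squeezing_score(p_orig, p_squeezed, w_dist=0.5, w_gini=0.5):
    """Weighted sum of the Manhattan distance and the absolute Gini
    impurity difference. The weights are hypothetical placeholders."""
    dist = manhattan_distance(p_orig, p_squeezed)
    gini_diff = abs(gini_impurity(p_orig) - gini_impurity(p_squeezed))
    return w_dist * dist + w_gini * gini_diff


def detect(p_orig, squeezed_preds, threshold, mode="max"):
    """Flag an input as adversarial when the maximum (or average)
    squeezing score over all squeezers exceeds the threshold."""
    scores = [squeezing_score(p_orig, p) for p in squeezed_preds]
    aggregate = max(scores) if mode == "max" else mean(scores)
    return aggregate > threshold, aggregate


if __name__ == "__main__":
    # Prediction on the unsqueezed input (made-up numbers).
    p_orig = [0.9, 0.05, 0.05]
    # Predictions after two hypothetical squeezers.
    stable = [[0.88, 0.07, 0.05], [0.9, 0.05, 0.05]]
    unstable = [[0.2, 0.7, 0.1], [0.9, 0.05, 0.05]]
    for name, preds in (("stable", stable), ("unstable", unstable)):
        for mode in ("max", "mean"):
            print(name, mode, detect(p_orig, preds, threshold=0.5, mode=mode))
```

In this toy example only one squeezer changes the prediction on the unstable input, so the maximum and the average criteria can disagree for the same threshold. In practice the threshold would be chosen separately for each criterion from legitimate training data.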