With the increasing maturity of deep learning, text classification based on deep models is widely used in practical tasks such as harmful-information detection and sentiment analysis. However, studies show that deep learning models are vulnerable to adversarial example attacks: by adding slight perturbations to an input, an attacker can make the target model produce incorrect predictions. This reduces the reliability of the model and poses serious security risks to the practical deployment of text classification models. To improve model robustness, the adversarial example problem therefore needs to be solved urgently. This paper focuses on adversarial examples in text classification tasks and explores the problem in depth from both the attack and the defense perspectives, aiming to reveal the defects of deep-learning-based text classification models and to improve their robustness and reliability. The main contents are as follows:

(1) On the attack side, a black-box method for generating text adversarial examples with high semantic similarity is proposed, based on synonym substitution. Studying adversarial example generation helps reveal the inherent flaws of a model and improve its performance. Most existing word-substitution methods emphasize attack success rate while neglecting the imperceptibility of adversarial examples, which leads to invalid attacks. To address this problem, this paper focuses on the imperceptibility of adversarial examples: the best synonym substitution for each word is selected according to the change in the model's classification confidence and the semantic similarity between examples, and semantic constraints are added during word substitution to preserve the semantics of the original example as much as possible. Experimental results show that the adversarial examples generated by this method achieve high semantic similarity while maintaining the attack success rate, alleviating the problem described above.

(2) On the defense side, targeting synonym-substitution attacks, a detect-and-correct defense method based on high-frequency synonym substitution is proposed, implemented as an add-on module. Existing defense efforts suffer from degrading the model's classification performance on clean examples, defending against only one specific attack method, or defending poorly. In this method, a perturbation detector identifies perturbed examples and filters out clean ones, so that only perturbed examples are corrected and the output on clean examples is unaffected. Synonym substitution is then combined with word-frequency information to select replacement words for the perturbed words, effectively restoring the adversarial examples so that the model outputs the correct label, and improving the model's robustness against such attacks. Experimental results show that this method can effectively defend against synonym-substitution attacks, reduces the impact on clean examples, and outperforms other baseline methods.

Figures: 14; Tables: 12; References: 69
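The attack idea in (1) can be sketched as follows. This is a minimal illustrative sketch only, not the thesis's actual method: the keyword-based classifier, the small synonym table, and the token-overlap similarity are toy stand-ins (a real system would query a trained black-box model and measure similarity with sentence embeddings).

```python
# Toy sketch: black-box synonym-substitution attack with a semantic-similarity
# constraint. All resources below are illustrative assumptions.

SYNONYMS = {  # toy synonym table (assumption)
    "great": ["fine", "decent"],
    "terrible": ["poor", "mediocre"],
}

def classify(words):
    """Toy black-box classifier: returns P(positive) from keyword counts."""
    score = 0.5
    for w in words:
        if w in ("great", "fine"):
            score += 0.2
        if w in ("terrible", "poor"):
            score -= 0.2
    return min(max(score, 0.0), 1.0)

def similarity(a, b):
    """Toy semantic similarity: token overlap (a real system would
    use sentence embeddings instead)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

def attack(words, sim_threshold=0.5):
    orig_conf = classify(words)
    adv = list(words)
    # Rank word positions by importance: confidence change when the word
    # is removed, estimated from black-box queries only.
    importance = []
    for i in range(len(words)):
        reduced = words[:i] + words[i + 1:]
        importance.append((abs(orig_conf - classify(reduced)), i))
    for _, i in sorted(importance, reverse=True):
        best, best_conf = adv[i], classify(adv)
        for syn in SYNONYMS.get(adv[i], []):
            cand = adv[:i] + [syn] + adv[i + 1:]
            # Accept a substitution only if it lowers the original-class
            # confidence AND keeps the example semantically close.
            if classify(cand) < best_conf and similarity(words, cand) >= sim_threshold:
                best, best_conf = syn, classify(cand)
        adv[i] = best
        if (classify(adv) > 0.5) != (orig_conf > 0.5):
            break  # predicted label flipped: attack succeeded
    return adv
```

For example, `attack("the movie was great".split())` flips the toy classifier's label by swapping only the most influential word, while the similarity constraint rejects substitutions that drift too far from the original sentence.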
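The detect-then-correct pipeline in (2) can likewise be sketched. Again this is only an illustrative sketch under stated assumptions: the synonym sets and word-frequency counts are invented stand-ins for real corpus statistics, and the detector is a deliberately simple heuristic, not the thesis's actual perturbation detector.

```python
# Toy sketch: detect perturbed inputs, then correct only those by mapping
# each word to the highest-frequency member of its synonym set.

SYNSETS = [{"great", "fine", "decent"}, {"terrible", "poor", "mediocre"}]
FREQ = {"great": 900, "fine": 400, "decent": 50,   # toy corpus frequencies
        "terrible": 800, "poor": 600, "mediocre": 40}

def synset_of(word):
    for s in SYNSETS:
        if word in s:
            return s
    return None

def is_perturbed(words, rare_ratio=0.25):
    """Toy detector: flag an input that contains a word far less frequent
    than the commonest member of its synonym set (a common trait of
    synonym-substitution adversarial examples)."""
    for w in words:
        s = synset_of(w)
        if s and FREQ[w] < rare_ratio * max(FREQ[x] for x in s):
            return True
    return False

def correct(words):
    """High-frequency synonym substitution: replace each covered word
    with the most frequent synonym in its set."""
    return [max(s, key=FREQ.get) if (s := synset_of(w)) else w for w in words]

def defend(words):
    # Clean examples pass through untouched; only flagged inputs are corrected,
    # so accuracy on clean examples is unaffected.
    return correct(words) if is_perturbed(words) else words
```

The key design point mirrored here is the filtering step: because `defend` returns clean inputs unchanged, the correction stage cannot degrade clean-example accuracy, which is the drawback the thesis attributes to prior defenses.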