
An Efficient Approach To Generate Text Adversarial Examples Via Words Replacement

Posted on: 2023-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: X J Wang
GTID: 2568307064470434
Subject: Computer technology
Abstract/Summary:
Adversarial examples are inputs deliberately crafted by attackers to deceive deep neural networks (DNNs), and their existence poses a serious threat to the security of DNNs. To reveal the inherent defects of DNNs and to improve the security and robustness of deep learning models, research on adversarial example generation methods is essential. Most existing state-of-the-art methods for generating text adversarial examples are based on word replacement. Because the replacement priority of words must be determined, previous work has mainly used a deletion-based scoring strategy to set that priority, which requires frequent queries to the target model. This leads to problems such as poor concealment and low interpretability of the attack. To address these problems, this thesis improves the word scoring strategy and proposes an efficient text adversarial example generation algorithm based on word replacement. The specific contents are as follows:

(1) To minimize the number of queries to the target model, this thesis proposes an efficient word-level adversarial example generation algorithm (Efficient Words Generation Adversarial Example, EWGAE). The algorithm scores words by combining an attention mechanism with the decisions of the target model, uses this score to determine the replacement priority of words, and then generates adversarial examples by synonym replacement. In the black-box setting, the algorithm first uses the attention mechanism to compute importance scores for all words in the original text; second, it takes the change in the target model's decision probability as the influence score of each synonym replacement; finally, it combines the two scores as the basis for ranking words. Experimental results show that this method greatly reduces the number of queries to the target model at the cost of a somewhat higher word perturbation rate. Compared with other methods, the algorithm also drives the classification accuracy of the target model after the attack lower.

(2) To better understand the model's decisions and to reduce the word perturbation rate, this thesis proposes an efficient text adversarial example generation algorithm based on an interpretable model and locality-sensitive hashing (LSH). In the black-box setting, the algorithm first uses the interpretable model to compute importance scores for all words in the original text; second, it uses LSH to capture the influence scores of synonym replacements on the target model's predictions; then it combines the two scores as the basis for ranking words; finally, it performs synonym replacement in descending order of score to carry out the adversarial attack. Comparison with baseline methods verifies the effectiveness of the algorithm, and its word modification rate is slightly lower than that of EWGAE.

Figure [19] Table [7] Reference [74]...
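The scoring and replacement loop described in (1) can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not say how the two scores are combined, so the weighted sum with `alpha`, the `top_k` attention filter used to limit queries, the hand-written attention values, the synonym table, and the toy classifier `toy_predict` are all assumptions made for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical toy target model: logistic score over per-word weights.
# It stands in for a black-box classifier returning P(original label | text).
WEIGHTS = {"really": 0.5, "quite": 0.3, "great": 1.5, "decent": 0.2, "fine": 0.4}
BIAS = -1.0


def toy_predict(words: Sequence[str]) -> float:
    z = BIAS + sum(WEIGHTS.get(w, 0.0) for w in words)
    return 1.0 / (1.0 + math.exp(-z))


class CountingModel:
    """Wraps a model and counts how many times it is queried."""

    def __init__(self, fn: Callable[[Sequence[str]], float]) -> None:
        self.fn = fn
        self.queries = 0

    def __call__(self, words: Sequence[str]) -> float:
        self.queries += 1
        return self.fn(words)


def attack(
    words: Sequence[str],
    attention: Sequence[float],
    synonyms: Dict[str, List[str]],
    predict: Callable[[Sequence[str]], float],
    top_k: int = 2,        # assumption: only the top-k attended words are probed
    alpha: float = 0.5,    # assumption: weighted sum of the two scores
    threshold: float = 0.5,
) -> Tuple[List[str], bool]:
    base = predict(words)
    total = sum(attention) or 1.0

    # Step 1: importance from attention alone (no target-model queries).
    candidates = sorted(range(len(words)), key=lambda i: attention[i], reverse=True)[:top_k]

    # Step 2: influence score = largest drop in decision probability
    # caused by any synonym of the candidate word.
    scored: List[Tuple[float, int, str]] = []
    for i in candidates:
        best_syn, best_drop = None, 0.0
        for syn in synonyms.get(words[i], []):
            trial = list(words)
            trial[i] = syn
            drop = base - predict(trial)
            if drop > best_drop:
                best_syn, best_drop = syn, drop
        if best_syn is not None:
            combined = alpha * attention[i] / total + (1 - alpha) * best_drop
            scored.append((combined, i, best_syn))

    # Step 3: replace words in descending order of the combined score
    # until the model's confidence in the original label falls below threshold.
    scored.sort(reverse=True)
    adversarial = list(words)
    for _, i, syn in scored:
        adversarial[i] = syn
        if predict(adversarial) < threshold:
            return adversarial, True
    return adversarial, False


model = CountingModel(toy_predict)
text = ["a", "really", "great", "movie"]
attn = [0.05, 0.25, 0.6, 0.1]  # hypothetical attention weights
syns = {"great": ["decent", "fine"], "really": ["quite"], "movie": ["film"]}
adv, success = attack(text, attn, syns, model)
print(adv, success, model.queries)
```

In this toy run, only the two most attended words are probed, so the target model is queried for three synonym candidates rather than for every word in the text; a deletion-based scorer would instead query once per word before any replacement is attempted.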
Keywords/Search Tags: text adversarial attacks, black-box attacks, deep neural networks, natural language processing