
Design And Implementation Of Spear-phishing Email Detection System Based On Email Content Mining

Posted on: 2023-04-01
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Cheng
Full Text: PDF
GTID: 2558306914972129
Subject: Computer technology
Abstract/Summary:
Email has become an indispensable communication tool in people’s daily work. Because email is so widely used, attackers often use it as a carrier for malicious links and attachments, inducing the target to click so that malicious behavior can be carried out. Compared with attacks delivered directly through malicious code, phishing email often does not need to exploit specific technical vulnerabilities and instead usually takes social engineering as its core. An attack requires only sending an email, so the attack cost is low, which makes the detection of phishing email attacks of great significance. Spear-phishing is a special form of phishing and the most harmful to the victim. Compared with "widely spread" ordinary phishing email, spear-phishing email is characterized by "target accuracy", "attack persistence", "camouflage concealment" and "severity of damage". Before the attack, the attacker uses social engineering techniques to continuously investigate the recipient or organization and customizes the email content for the recipient, which makes the success rate of spear-phishing email attacks very high. At the same time, spear-phishing email attacks usually target members of important organizations such as governments, companies, enterprises and authoritative institutions. Once an attack succeeds, it causes serious consequences.

Automatic detection of spear-phishing email attacks mostly continues the methods used for ordinary phishing email, mainly rule-based detection, sandbox detection, machine learning and deep learning. However, spear-phishing attackers tend to be more cautious, and the emails they send are better camouflaged and more deceptive, so they bypass the first two detection methods. At the same time, because spear-phishing email attacks occur at low frequency, machine learning and deep learning methods, which need a large number of data samples, cannot be applied well. Therefore, to address the difficulties of spear-phishing email detection, this paper designs a spear-phishing email detection method based on email content mining. The main research results are as follows:

(1) A method of phishing email mining and analysis is presented. Because the boundary between spear-phishing email and ordinary phishing email is not obvious in terms of features, this paper designs a phishing email detection method based on the k2LSTM model. First, the K-means and KNN algorithms are used to expand the manually labeled samples over the complete dataset. The expanded samples are then used as the training set for an LSTM model, which learns the language style and wording habits of the email body and screens out the phishing emails in the dataset. This method effectively handles the large volume of email data and the complex content types found in a real production environment, and screens out as many of the phishing emails in the dataset as possible.

(2) An analysis method for potential spear-phishing email attacks is proposed. Because spear-phishing attackers need to investigate the target and carefully construct the email content, this paper puts forward the concept of the attack cost of an email. The paper argues that attackers incur a substantial cost when constructing such emails, so the higher the cost of an email, the more likely it is to be spear-phishing. The paper evaluates each email along two dimensions, the cost of forgery and fraud and the cost of analyzing the attack target, then sorts the emails by cost and analyzes potential spear-phishing attacks.

(3) A detection method for spear-phishing email from specific attack organizations in a small-sample environment is proposed. For the situation where only a small number of spear-phishing email samples have a clear organization attribution, this paper proposes a spear-phishing email organization classification method based on hierarchical concatenated pooling. It processes the email body with a simple word-embedding model and greatly reduces the number of model parameters, thereby reducing the model’s dependence on the size of the training set. The method then combines the key characteristics of the email and uses a classical machine learning model as the classifier to complete the task of classifying spear-phishing email by organization.

(4) A spear-phishing email detection system is developed. To verify the effectiveness of the methods proposed in this paper, the related attack detection system is designed and implemented, and its overall design process and module details are described. Finally, designed experiments verify the effectiveness of the proposed method, examine how the choice of algorithms and parameters in the model influences it, and show that it improves significantly on other solutions.
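
The attack-cost ranking in contribution (2) can be illustrated with a small sketch. The abstract names only the two dimensions (forgery and fraud cost, and target analysis cost) and the sorting step. The signal names, the weights, and the additive scoring rule below are all hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical weights: how much effort each signal is assumed to cost the attacker.
FORGERY_WEIGHTS: Dict[str, float] = {
    "spoofed_display_name": 2.0,
    "lookalike_domain": 3.0,
}
TARGETING_WEIGHTS: Dict[str, float] = {
    "mentions_recipient_name": 1.0,
    "mentions_internal_project": 4.0,
}


@dataclass
class Email:
    email_id: str
    # Hypothetical boolean signals extracted from the email beforehand.
    forgery_signals: Dict[str, bool] = field(default_factory=dict)
    targeting_signals: Dict[str, bool] = field(default_factory=dict)


def dimension_cost(signals: Dict[str, bool], weights: Dict[str, float]) -> float:
    """Sum the weights of the signals that are present in one dimension."""
    return sum(w for name, w in weights.items() if signals.get(name, False))


def attack_cost(email: Email) -> float:
    """Assumed scoring rule: total cost is the sum of the two dimensions."""
    return (dimension_cost(email.forgery_signals, FORGERY_WEIGHTS)
            + dimension_cost(email.targeting_signals, TARGETING_WEIGHTS))


def rank_by_cost(emails: List[Email]) -> List[Email]:
    """Sort emails so the highest-cost (most suspicious) ones come first."""
    return sorted(emails, key=attack_cost, reverse=True)
```

Under these assumptions, an analyst would review the top of the ranked list first, since high-cost emails are the ones the abstract considers most likely to be spear-phishing.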
Keywords/Search Tags:spear-phishing email, attack cost of email, few-shot learning, word-embedding model
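
To make the "word-embedding model" keyword concrete, here is a minimal sketch of parameter-free pooling over word embeddings, the idea behind contribution (3). The abstract does not define "hierarchical concatenated pooling", so this sketch assumes a common reading: average embeddings within sliding windows, take the element-wise maximum across windows, and concatenate the result with global mean and max pooling. The function names and the default window size are illustrative only.

```python
from typing import List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Element-wise average of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def max_pool(vectors: List[Vector]) -> Vector:
    """Element-wise maximum of a non-empty list of equal-length vectors."""
    return [max(col) for col in zip(*vectors)]


def hierarchical_pool(embeddings: List[Vector], window: int = 3) -> Vector:
    """Average within each sliding window, then max-pool across the windows."""
    if not embeddings:
        raise ValueError("need at least one word embedding")
    if len(embeddings) <= window:
        windows = [embeddings]  # text shorter than one window: a single window
    else:
        windows = [embeddings[i:i + window]
                   for i in range(len(embeddings) - window + 1)]
    return max_pool([mean_pool(w) for w in windows])


def concatenated_features(embeddings: List[Vector], window: int = 3) -> Vector:
    """Assumed 'concatenated' variant: hierarchical + global mean + global max."""
    return (hierarchical_pool(embeddings, window)
            + mean_pool(embeddings)
            + max_pool(embeddings))
```

Because the pooling step has no trainable parameters, the resulting fixed-length vector can be passed, together with other email features, to a classical classifier, which is consistent with the abstract's goal of reducing dependence on training-set size.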