Research On Phishing Email Identification Method

Posted on:2024-05-04

Degree:Master

Type:Thesis

Country:China

Candidate:Y H Xiao

Full Text:PDF

GTID:2558307067973159

Subject:Computer technology

Abstract/Summary:


With the arrival of the information age, email has become one of the most important means of communication. At the same time, it has brought a series of security problems. More and more attackers use email as a carrier to trick users into providing sensitive information or performing malicious operations, causing large financial losses and data leakage risks. Email security incidents occur frequently, so studying email security is important for improving network security protection and protecting users’ privacy. In addition, although deep learning has achieved great success in many fields, it is still rarely applied to phishing email recognition. Compared with other methods, phishing email recognition based on deep learning offers better performance and higher recognition efficiency, and it has become a new trend in recent years. This thesis is therefore devoted to researching a phishing email recognition method based on deep learning and developing a phishing email recognition system built on it. The main contributions of this thesis are as follows:

(1) To address the unsystematic feature representation in deep-learning-based phishing email recognition, this thesis proposes a multi-level, multi-feature method for analyzing phishing email features. It analyzes the text features of phishing emails from four aspects: word features at the character layer, semantic features at the logical layer, emotion features at the cognitive layer, and URL (Uniform Resource Locator) features at the character layer, and it proposes an appropriate representation for each. For word features, an improved TF-IDF (Term Frequency-Inverse Document Frequency) method is used to filter feature words. For semantic features, a Word2Vec word vector model is built on the mail corpus, which can represent the semantic information of Chinese and English words simultaneously. For emotion features, to address the lack of emotion corpora in the phishing email field, an emotion text corpus covering the fear, curiosity and urgency found in phishing emails is constructed. For URL features, given the particular syntax of URLs, N-gram word segmentation and character-level encoding are used to obtain the feature representation. Finally, two new features are proposed: an attachment name correlation feature and a text correlation coefficient feature.

(2) To address the poor interpretability and robustness of current phishing email recognition models, this thesis proposes a phishing email recognition model based on multi-channel BiLSTM (Bidirectional Long Short-Term Memory) + Attention. The model feeds the multi-layer features extracted from emails into multi-channel networks for processing and analysis, and uses BiLSTM to learn the contextual dependencies of the text features. The model also introduces an adaptive Dropout regularization method to improve its generalization ability. A scaled dot-product attention mechanism is then introduced to sharpen the model's focus, so that it can identify phishing emails more accurately. Finally, Focal Loss, an improved binary cross-entropy loss function, is introduced to optimize the model for the imbalanced mail dataset. The experimental results show that the proposed model outperforms the existing baseline models on every metric, and its accuracy reaches 98.87% on the mixed Chinese and English dataset.

(3) Based on the two studies above, this thesis discusses the application value of a phishing email recognition system, and designs and implements such a system based on deep learning. Users can upload email data in a specified format or as an EML file. The system processes the input and outputs the recognition result together with a visualization of the feature attention weights, helping users understand the basis for the result more clearly.
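
Two of the model components named in the abstract, the dot-product attention mechanism and Focal Loss, have widely used standard forms. The following is a minimal pure-Python sketch of those standard forms for a single example, not the thesis's implementation. The abstract gives no hyperparameters or layer shapes, so the function names, the `gamma=2.0` and `alpha=0.25` defaults (taken from the original Focal Loss formulation), and the single-query attention layout are all assumptions made for illustration.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one example.

    p: predicted probability of the positive (phishing) class.
    y: true label, 0 or 1.
    gamma, alpha: assumed defaults; the abstract does not state the values used.
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)           # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)^gamma down-weights examples the model already classifies well.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(query, keys, values):
    """Attention for a single query vector.

    Returns (context, weights), where weights = softmax(q . k_i / sqrt(d_k))
    and context is the weighted sum of the value vectors.
    """
    d_k = len(query)
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights
```

With `gamma=0` the focal term disappears and the loss reduces to an alpha-weighted binary cross-entropy, which is why Focal Loss is described as a modification of that loss. The `weights` returned by the attention function are the kind of per-feature quantity that an attention-weight visualization could plot.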

Keywords/Search Tags:

Phishing Email Identification, Deep Learning, Emotion Analysis, Attention Mechanism


Related items

1	Design And Implementation Of Phishing Email Detection System Based On Deep Learning
2	Design And Implementation Of Spear-phishing Email Detection System Based On Email Content Mining
3	Research On Detection Method Of Phishing Web Page Based On Deep Learning
4	Research On Text Emotion Analysis Based On Deep Learning
5	Research On Text Sentiment Analysis Method Based On Deep Learning
6	Research On Deep Learning Emotion Recognition Method Based On Attention Mechanism
7	Research On Emotion Classification Of Texts Based On Deep Learning
8	Design And Implementation Of Phishing Email Detection System Based On Psychological Feature Analysis
9	Research On Aspect Level Emotion Analysis Based On Deep Learning
10	Multi-emotion Analysis Based On Deep Learning And Emoticon Distribution