Research And Implementation On Joint Features And Intelligent Detection Algorithms Of Phishing Webpages

Posted on:2019-02-20

Degree:Master

Type:Thesis

Country:China

Candidate:X P Jia

Full Text:PDF

GTID:2428330545457853

Subject:Computer application technology

Abstract/Summary:

Phishing webpage fraud is a major tool of criminals on the modern Internet. In recent years, the number of webpage attacks has risen significantly, hitting a record high in 2017. Attackers can deploy a webpage attack at very low cost and spread it on a large scale in a short period of time. To protect the information security of Internet users, it is crucial to study more accurate and faster automatic webpage detection methods that can resist this fast-paced form of cyber attack. In this dissertation, the classification of phishing webpages was investigated using features derived from three sources: the URL, web content elements, and related information. Feature extraction, feature selection, and feature importance calculation were performed on these features. To let the classification models express a richer, finer-grained description of web pages, a joint feature rate R (0 < R <= 1) was introduced to extend and combine the basic features. On this basis, a variety of basic classification models were implemented, and the ability of multiple models trained on features of different dimensions to detect phishing webpages was systematically compared. First, the optimal-parameter models were obtained by tuning the parameters of multiple classification models, and the classification results of these optimal-parameter models trained on different joint features were compared. Second, the optimal classification model was determined by comparing the respective optimal-parameter models. In addition, the selected optimal-parameter model was compared with existing related research results. The results show that the random forest and neural network models have excellent detection performance.

This dissertation also proposes an improved self-training method for semi-supervised learning. The method divides a large unlabeled dataset evenly into multiple sub-datasets and trains the classification models on these sub-datasets sequentially. The detection accuracy and running time of four common classification models under the improved self-training method were compared. Compared with the traditional self-training method, the improved method detects phishing webpages effectively: it keeps the classification performance equal to that of the traditional method while reducing the running time by more than 50%. This provides a new idea for addressing the lack of large-scale, reliably labeled data and for online detection.
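The abstract gives no implementation details for the improved self-training method, so the following Python code is only a minimal sketch of the general idea: split the unlabeled pool evenly into sub-datasets, pseudo-label one sub-dataset at a time, and retrain after each one. The names `CentroidClassifier`, `split_evenly`, and `chunked_self_training`, the confidence formula, and the parameters `k` and `threshold` are all assumptions made for illustration. The nearest-centroid model is a stand-in, not one of the four classifiers evaluated in the thesis.

```python
import math


class CentroidClassifier:
    """Hypothetical stand-in model: predicts the class with the nearest centroid."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict_with_confidence(self, x):
        # Assumed confidence score: how much closer the best centroid is than
        # the runner-up, scaled to [0.5, 1.0]. Requires at least two classes.
        dists = sorted((math.dist(x, c), label) for label, c in self.centroids.items())
        (d_best, best_label), (d_second, _) = dists[0], dists[1]
        total = d_best + d_second
        confidence = 0.5 if total == 0 else d_second / total
        return best_label, confidence


def split_evenly(items, k):
    """Split items into k contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), k)
    chunks, start = [], 0
    for i in range(k):
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def chunked_self_training(X_lab, y_lab, X_unlab, k=4, threshold=0.8):
    """Self-training over k sub-datasets; returns (model, number of pseudo-labels added)."""
    X_train, y_train = list(X_lab), list(y_lab)
    model = CentroidClassifier().fit(X_train, y_train)
    for chunk in split_evenly(list(X_unlab), k):
        # Pseudo-label the current sub-dataset with the current model and keep
        # only the confident predictions.
        for x in chunk:
            label, confidence = model.predict_with_confidence(x)
            if confidence >= threshold:
                X_train.append(x)
                y_train.append(label)
        # Retrain once per sub-dataset on the labeled plus pseudo-labeled data.
        model = CentroidClassifier().fit(X_train, y_train)
    return model, len(X_train) - len(y_lab)
```

In this sketch each unlabeled sample is scored exactly once and the model is retrained `k` times, whereas a conventional self-training loop re-scores the whole unlabeled pool in every round. That is one plausible source of the running-time saving the abstract reports, though the thesis itself may achieve it differently.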

Keywords/Search Tags:

Phishing webpage detection, Machine learning, Joint feature, Optimal classification model, Self-Training

Related items

1	Research On Phishing Webpages Detection Based On Machine Learning
2	Research On Phishing Detection Model Based On Improved TCD Image Retrieval And Classification
3	A Phishing Website Detection Method Based On Stacking Model
4	Machine Learning Based Malicious Webpage Analysis
5	Research On Phishing Website Detection Based On Data Mining Classification Algorithm
6	Research On Phishing Detection Based On Feature Label
7	Research On Phishing Website Hierarchical Detection Based On Webpage Features
8	Research On A Method For Phishing Webpage Detection Based On DOM Structure Clustering
9	A Phishing Detecting Method Based On The Relationship Between The Web-pages
10	Phishing Detection Technology Based On URL And Web Page Features