Research On Phishing Detection Mechanism By Integrating New URL Features

Posted on: 2018-09-23; Degree: Master; Type: Thesis
Country: China; Candidate: D D Sun; Full Text: PDF
GTID: 2348330518999219; Subject: Computer Science and Technology
Abstract/Summary:
Network security problems keep emerging as Internet technology develops. Phishing is a typical form of online fraud in which an attacker obtains a user's sensitive information by disguising a site as a legitimate website. Victims suffer varying degrees of information disclosure, which leads to economic loss and exposure of personal privacy. Detecting phishing websites quickly and accurately has therefore become a focus of Web information security research. Because some common features cannot effectively distinguish new phishing websites, and several detection methods based on comprehensive features are inefficient, this thesis proposes a lightweight hierarchical detection mechanism.
First, a statistical analysis is performed on more than 20,000 URL samples, and URL features are studied in depth from two aspects: URL structure and WHOIS information. A new URL feature set is then developed through feature selection methods. An algorithm based on Levenshtein distance is proposed to detect unusual brand names, and a collection of suspicious words used by phishing sites is constructed with a generalized suffix tree. On this basis, the thesis proposes an improved decision tree algorithm as the classification model, which uses a threshold to judge the reliability of each result. For URL samples whose result has low credibility, page features are extracted for a final detection step. Compared with URL features, page features must be obtained by analyzing the content of the web page and are more complicated to extract. With regard to spoofing of page features, the thesis summarizes several common spoofing methods and handles them during page feature extraction, which prevents spoofed features from influencing the results.
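As an illustration of the Levenshtein-based brand check mentioned above, the following is a minimal Python sketch, not the thesis's actual algorithm. The brand list `KNOWN_BRANDS`, the function name `suspicious_brand`, and the `max_distance` cutoff of 2 are all assumptions made for this example; the abstract does not specify them. The idea shown is only that a domain label that is close to, but not identical to, a known brand name is treated as suspicious.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical brand list, for illustration only.
KNOWN_BRANDS = ("paypal", "amazon", "apple")


def suspicious_brand(label: str, max_distance: int = 2) -> Optional[str]:
    """Return the brand that `label` imitates, or None.

    An exact match (distance 0) is not flagged here; only near misses
    within `max_distance` edits are treated as suspicious.
    """
    label = label.lower()
    for brand in KNOWN_BRANDS:
        distance = levenshtein(label, brand)
        if 0 < distance <= max_distance:
            return brand
    return None
```

For example, `suspicious_brand("paypa1")` returns `"paypal"` because one substitution separates the two strings, while `suspicious_brand("paypal")` returns `None`. A real system would need a much larger brand list and a cutoff tuned on data.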
Because the page features are high-dimensional, the SVM algorithm is used as the classification model for this stage, and a genetic algorithm is used to optimize the SVM parameters. The hierarchical detection mechanism in this thesis is thus divided into two stages: detection based on URL features and detection based on page features. Only the small number of samples whose URL-based classification result has low reliability need page features to be integrated, which improves the efficiency of phishing detection. To verify the effectiveness of the detection mechanism and the representativeness of the feature set, comparison experiments are carried out. The experimental results show that the proposed method achieves higher accuracy.
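The two-stage control flow described above can be sketched in a few lines of Python. This is only an outline under stated assumptions: the names `cascade_detect`, `toy_url_stage`, and `toy_page_stage`, the confidence scale, and the default threshold of 0.9 are hypothetical, and the toy stages are trivial stand-ins for the improved decision tree and the GA-tuned SVM, which the abstract does not describe in detail.

```python
from typing import Callable, Tuple

# Stage 1 returns (is_phishing, confidence in [0, 1]); stage 2 returns is_phishing.
UrlStage = Callable[[str], Tuple[bool, float]]
PageStage = Callable[[str], bool]


def cascade_detect(url: str,
                   url_stage: UrlStage,
                   page_stage: PageStage,
                   threshold: float = 0.9) -> Tuple[bool, str]:
    """Classify `url`, falling back to page features only when needed.

    Returns the decision and the name of the stage that produced it.
    """
    is_phishing, confidence = url_stage(url)
    if confidence >= threshold:
        # The cheap URL-only result is considered reliable enough.
        return is_phishing, "url"
    # Low-confidence sample: pay the cost of fetching and analyzing the page.
    return page_stage(url), "page"


# Toy stand-ins (hypothetical), used only to exercise the control flow.
def toy_url_stage(url: str) -> Tuple[bool, float]:
    if "@" in url:
        return True, 0.95   # treated as a strong URL-level signal
    return False, 0.6       # otherwise unsure


def toy_page_stage(url: str) -> bool:
    return "login" in url   # placeholder for real page-content analysis
```

With these stand-ins, `cascade_detect("http://user@evil.example/", toy_url_stage, toy_page_stage)` is decided by the URL stage, while a URL without `@` falls through to the page stage. The efficiency gain claimed in the abstract comes from most samples stopping at the first branch.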
Keywords/Search Tags:Phishing Websites Detection, URL Features, Multiple Features Integration, Spoofed Feature, Cascading Detection