Research And Implementation Of WebShell Feature Reduction Methods Based On Neighbourhood Components Analysis

Posted on: 2022-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: A J Zhou
GTID: 2518306539998169
Subject: Computer application technology
Abstract/Summary:
Network security has been raised to the level of national strategy, and the WebShell is one of its main threats. As attack techniques advance, new kinds of network backdoors keep emerging, and WebShell detection has become an important research topic in network security. In recent years, research on WebShell detection at the source code layer and at the source code compilation result layer has made new progress and has effectively improved the ability to detect unknown WebShells. When constructing WebShell text features, the text information should be preserved as completely as possible: the more complete the information, the more likely it is that different types of WebShell are detected. However, a large volume of text data and features leads to information redundancy, misjudged results, and high computation cost. Dimension reduction is an effective way to handle massive data, because it limits interference from noise in high-dimensional features and helps mine key information. It is therefore necessary to study a feature dimension reduction method that benefits classification, mines effective information, and obtains the key components of the text representation. This thesis introduces Neighbourhood Components Analysis (NCA) from metric learning and studies its application to WebShell feature dimension reduction from two angles: feature extraction and feature selection. The main work and contributions are summarized as follows:

1) The study focuses on WebShells written in PHP, takes the opcode sequence features of the source code compilation layer as the research object, and constructs a feature index model suited to WebShells. To address the insufficient consideration of context in opcode sequence features, the thesis proposes preserving the context information of opcodes by using sequence fragments of unfixed length.

2) A WebShell feature extraction method based on NCA_ReliefF is proposed, which alleviates the explosion in the number of opcode sequence features and the resulting unsatisfactory detection performance. The NCA_ReliefF method has two characteristics. First, it uses classifier accuracy to adjust the projection matrix automatically, reducing high-dimensional features while maintaining high accuracy. Second, it selects features with class discrimination ability according to how the features of nearest-neighbor samples differ in the low-dimensional space. Experiments show that the feature components obtained by this method can effectively detect WebShells, and that its recall rate is better than that of most traditional feature dimension reduction algorithms.

3) A WebShell feature selection method based on Regularized Neighbourhood Component Analysis (RNCA) is proposed, which solves two problems. First, to make up for the poor feature interpretability of the NCA_ReliefF method, it selects features that benefit classification from the opcode sequence feature set through iterative learning of a weight vector, and thereby achieves feature dimension reduction. Second, it addresses the low WebShell detection performance caused by relying on a single kind of feature: the method increases feature completeness from a statistical point of view and combines static statistical features with behavioral sequence features to judge whether a sample is a WebShell. Experiments show that the accuracy of the RNCA-based feature selection method can reach 99%.

WebShell detection methods are diverse and should be chosen according to the requirements. If a model must be built quickly to produce predictions, the NCA_ReliefF method can be used for feature dimension reduction, and it retrieves a large number of WebShell samples. If high recognition accuracy is required and training time is not a concern, the RNCA-based feature selection method completes the detection task with high quality.
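
The idea in contribution 1) of keeping opcode context through fragments of unfixed length can be illustrated with a minimal sketch. The abstract does not define how the fragments are built, so this assumes they are contiguous opcode n-grams collected over a range of lengths. The function name `opcode_ngrams`, the parameters `min_n` and `max_n`, and the sample opcode trace are all hypothetical and are not taken from the thesis.

```python
from collections import Counter


def opcode_ngrams(opcodes, min_n=1, max_n=3):
    """Count contiguous opcode fragments of every length from min_n to max_n.

    Assumption: "unfixed-length sequence fragments" are modelled here as
    n-grams over a range of n. The thesis may define them differently.
    """
    counts = Counter()
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the opcode sequence.
        for i in range(len(opcodes) - n + 1):
            counts[tuple(opcodes[i:i + n])] += 1
    return counts


# Hypothetical opcode trace for one PHP file (illustrative only).
sample = ["ASSIGN", "INIT_FCALL", "SEND_VAR", "DO_FCALL", "ECHO"]
features = opcode_ngrams(sample, min_n=1, max_n=3)

for fragment, count in sorted(features.items()):
    print(" ".join(fragment), count)
```

Mixing several fragment lengths keeps both single-opcode frequencies and short call patterns, but the feature count grows quickly with `max_n`, which is the feature explosion that the dimension reduction methods in 2) and 3) are meant to address.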
Keywords/Search Tags: WebShell detection, feature reduction, opcode, Neighbourhood Components Analysis