Application Of Machine Learning Algorithm In Webshell Detection And Security Research

Posted on: 2023-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: J W Zhang
GTID: 2568306836469614
Subject: Cyberspace security
Abstract/Summary:
A webshell is a backdoor file left by an attacker after successfully infiltrating a website server. Such a file is hard to find, and it lets the attacker browse key files of the website and even modify or delete important data in the database, which makes it a serious threat to network security. A more efficient webshell detection method helps network security operation and maintenance personnel keep a website running normally. Traditional attack detection methods can no longer cope with the current complex network environment, and webshell detection based on machine learning is a better choice. However, feature extraction in existing machine-learning-based webshell detection schemes still needs to be optimized, which limits detection accuracy. To this end, this thesis proposes a Webshell Detection scheme based on Fine-graineD Features (the WD-FDF scheme). To eliminate the interference of encryption and obfuscation techniques, the opcode sequence is extracted and analyzed with natural language processing methods to produce vectorized dynamic features. These are then combined with static statistical features to form a 107-dimensional fine-grained feature vector set, which provides optimized data for training machine learning algorithms. Finally, the performance of different algorithms is compared to select an appropriate one and form a complete detection scheme. Simulation results show that the scheme accurately distinguishes normal web pages from webshells, reaching an accuracy of 99.574%.

Based on the WD-FDF scheme, this thesis designs and implements WebshellHunter, a webshell detection system. Traditional rule matching can accurately detect known attacks; on top of it, the system introduces a machine learning detection model to address the detection of unknown attacks. After offline training and testing on the optimized dataset, the model shows good discrimination and generalization ability and can make accurate, fast judgments when detecting attacks online. In addition, the system provides a graphical user interface, supports user-defined scanning paths, and produces a complete detection report to help users analyze the specific types of webshells found. The final system tests show that the system achieves a high detection rate and a low false alarm rate, offers a good user experience, and has practical application value.

To address the security vulnerability of machine learning algorithms themselves, this thesis studies the data level of the machine learning model, takes the K-means algorithm as the research object, and proposes a poisoning attack scheme based on boundary data. First, the relevant characteristics of the K-means algorithm and its existing security problems are analyzed. Second, using the idea of "data drift" and the characteristics of the K-means algorithm, a poisoning attack based on the boundary data between clusters is designed by observing the cluster labels and the loss function. Finally, the attack scheme is applied to the scenario of network attack detection. The experimental results show that the attack causes the K-means model to produce false positives, with a false positive rate as high as 16.75%, which seriously affects the availability of the K-means algorithm.
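
The boundary-data poisoning idea described above can be illustrated with a minimal Python sketch. This is not the attack scheme of the thesis. It assumes a single one-dimensional feature, two clusters (normal and attack, with the attack cluster at higher values), a detector that is retrained starting from its previous centroids, and synthetic data. The names `fit`, `poison`, `false_positive_rate` and the parameters `rounds`, `per_round`, `eps` are hypothetical and chosen only for illustration.

```python
# Illustrative only: 1-D, k=2 K-means with a boundary-based poisoning loop.
# All names, parameters, and data here are assumptions, not the thesis's method.

def closer_to_attack(p, c_normal, c_attack):
    """True if point p is strictly closer to the attack centroid."""
    return abs(p - c_attack) < abs(p - c_normal)

def fit(points, c_normal, c_attack, iters=100):
    """Lloyd's algorithm for two clusters, warm-started from the given centroids."""
    for _ in range(iters):
        attack_side = [p for p in points if closer_to_attack(p, c_normal, c_attack)]
        normal_side = [p for p in points if not closer_to_attack(p, c_normal, c_attack)]
        if not attack_side or not normal_side:
            break
        new = (sum(normal_side) / len(normal_side),
               sum(attack_side) / len(attack_side))
        if new == (c_normal, c_attack):
            break  # assignments stopped changing
        c_normal, c_attack = new
    return c_normal, c_attack

def false_positive_rate(normal_points, c_normal, c_attack):
    """Share of genuinely normal points that the model labels as attack."""
    flagged = sum(closer_to_attack(p, c_normal, c_attack) for p in normal_points)
    return flagged / len(normal_points)

def poison(train, centroids, rounds=3, per_round=8, eps=0.05):
    """Each round, inject points just on the attack side of the current
    decision boundary and retrain, so the attack centroid drifts toward
    the normal cluster. Assumes the attack centroid is the larger one."""
    data = list(train)
    c_normal, c_attack = centroids
    for _ in range(rounds):
        boundary = (c_normal + c_attack) / 2
        data.extend([boundary + eps] * per_round)
        c_normal, c_attack = fit(data, c_normal, c_attack)
    return c_normal, c_attack

# Synthetic 1-D feature values: normal traffic in [0, 4], attacks in [6, 8].
normal = [i / 10 for i in range(41)]
attack = [6 + i / 10 for i in range(21)]
train = normal + attack

clean = fit(train, min(train), max(train))
poisoned = poison(train, clean)

clean_fpr = false_positive_rate(normal, *clean)
poisoned_fpr = false_positive_rate(normal, *poisoned)
print("clean centroids:", clean, "false positive rate:", clean_fpr)
print("poisoned centroids:", poisoned, "false positive rate:", poisoned_fpr)
```

In this toy setting the injected points sit close to the cluster boundary, so each one looks unremarkable, yet together they pull the attack centroid toward the normal cluster until normal samples near the boundary are flagged as attacks.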
Keywords/Search Tags:Machine Learning, Web Attack Detection, Webshell, Poisoning Attack, K-means Algorithm