| Web-based applications typically rely on a back-end database server to manage persistent state and to retrieve data by executing queries built from user input. If a user's request is not handled properly, an attacker can change the structure of the SQL statement. The server may then suffer SQL injection attacks through the web application, which can even endanger the security of the system database. This thesis describes the characteristics and hazards of SQL injection attacks and explores machine learning methods for detecting and recognizing them. Security companies currently detect injections by matching requests against a rule base. This approach requires specific detection rules to be written in advance and cannot discover new, better-hidden injection behavior in time, so rule updates lag behind and real-time detection degrades. The work in this thesis aims to address these problems. The main research contents are as follows: 1. Based on an understanding of the existing data and of the research objective, a series of operations is performed on the data, such as similarity removal, URL decoding, string splitting, and generalization of similar fields, to prepare the data for constructing machine learning features. 2. Based on an understanding of the data and the business, features are constructed from the original data from three aspects: the visitor, the visitor, and the URL field. 3. A monitoring model based on supervised learning algorithms is designed. Based on experimental comparison, targeted two-pass sampling is applied to the normal data, and a twice-trained ensemble learning method based on the idea of incremental training is used for each machine learning model, which mitigates the class imbalance in the data. The final scheme combines a Bayesian classifier and a decision tree, which improves the accuracy of SQL injection detection. 4. The model proposed in this thesis has the following advantages: it can not only detect the
known threats, but also detect some unknown threats, such as those that a regular-expression rule engine cannot match. However, the method incurs a large time overhead when the volume of data is large. Therefore, this thesis introduces a method of constructing a whitelist database using unsupervised statistical learning, which facilitates the rapid recognition of large amounts of normal data and avoids excessive time overhead during detection. Extensive experiments demonstrate the high accuracy of the model. Finally, a system for detecting SQL injection based on machine learning algorithms is designed and implemented. |
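
The preprocessing operations named in item 1 (URL decoding, string splitting, generalization of similar fields, and removal of similar records) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the function names, the `NUM` and `HEX` placeholder tokens, the tokenization rule, and the choice to drop duplicates after generalization are all hypothetical.

```python
import re
from urllib.parse import unquote

# Assumed generalization rules: inputs are lowercased first, so the
# uppercase placeholders NUM and HEX cannot collide with request text.
_HEX = re.compile(r"0x[0-9a-f]+")
_NUM = re.compile(r"\d+")
# Assumed splitting rule: runs of letters/underscores, or single symbols.
_TOKEN = re.compile(r"[A-Za-z_]+|[^\sA-Za-z_]")


def decode(url, max_rounds=3):
    """URL-decode repeatedly (payloads may be double-encoded), then lowercase."""
    url = url.replace("+", " ")  # '+' encodes a space in query strings
    for _ in range(max_rounds):
        decoded = unquote(url)
        if decoded == url:
            break
        url = decoded
    return url.lower()


def generalize(text):
    """Replace concrete hex and decimal literals with placeholder tokens."""
    text = _HEX.sub("HEX", text)
    return _NUM.sub("NUM", text)


def tokenize(text):
    """Split a request string into word and symbol tokens."""
    return _TOKEN.findall(text)


def preprocess(urls):
    """Decode, generalize, split, and drop records with identical token lists."""
    seen, result = set(), []
    for url in urls:
        tokens = tuple(tokenize(generalize(decode(url))))
        if tokens not in seen:
            seen.add(tokens)
            result.append(list(tokens))
    return result


if __name__ == "__main__":
    samples = ["/a?id=1", "/a?id=2", "/a?id=1%27%20or%201%3D1"]
    for tokens in preprocess(samples):
        print(tokens)
```

In this sketch the first two sample requests differ only in a numeric literal, so they collapse to the same token list and only one is kept. The resulting token lists would then feed whatever feature construction is used downstream.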