| An HTTP flood attack is a denial-of-service attack against Web applications. By imitating normal user messages and using the massive volume of data generated by HTTP as cover, the attacker makes application-layer flood attacks much harder to detect. Earlier detection methods based on network traffic can effectively detect the existence of a flood attack, and entropy and other statistical methods, as well as detection algorithms based on machine learning, have been applied successfully. However, mainstream detection methods lack a relatively complete feature set evaluation process, rely mostly on shallow learning models, give little consideration to time series information, and produce detection results that lack interpretability. To solve these problems, this thesis, based on the large-scale traffic scenario and the characteristics of HTTP flood attacks, proposes an interpretable HTTP flood traffic sequence detection system and improves the traditional method in three stages: feature set evaluation before detection, recognition and classification by the detection model, and analysis of attack behavior messages after detection. The specific work is as follows: 1. An HTTP flood attack feature set extraction scheme for massive data scenarios and adaptive feature engineering based on a hybrid combination algorithm using GBDT are proposed. In the feature set extraction stage before detection, the detection features in current research that suit the massive traffic scenario were sorted out from 29 perspectives, and 42-dimensional detection features were proposed from the perspectives of frequency, content, load, time, sequence and derivation. In the adaptive feature engineering, a combination algorithm that couples univariate filtering with recursive feature elimination based on gradient boosted decision trees is adopted to complete the feature evaluation and obtain the important feature subset. Experimental results show that selecting the important feature subset can improve
the detection effect compared with the traditional method. 2. A deep learning detection model based on a temporal convolutional network is proposed. To address the shallow learning of existing detection models, in the detection phase the time series of traffic behavior is analyzed, HTTP traffic is aggregated by the triple of source IP, destination IP and timestamp, and aggregated flow series are constructed from the important feature subset. The aggregated flow series are then fed into the temporal convolutional network to detect attack series. After the network structure design and the parameter evaluation experiments, the detection model classifies HTTP flood attacks perfectly and outperforms the other detection models. 3. An HTTP flood sequence clustering scheme and a behavioral risk scoring model are proposed. To address the interpretability of attack detection, a K-means clustering algorithm based on dynamic time warping is proposed to cluster attack messages in the post-detection behavior analysis, and the key attack sequences are extracted in combination with Shapelets. In the experiments, the centroids and key behavior series of similar attack series are extracted, which improves the interpretability of detection. A behavioral risk scoring model is also proposed; it uses the weight of evidence and information value generated by the attack series and is built on the logistic model. The risk score of a behavior is composed of the risk scores that the message obtains on each flood feature. Security personnel can look up the distribution interval of flood behavior according to the score, which makes the detection results more transparent and explainable. |
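
The dynamic time warping distance that the clustering step in point 3 relies on can be illustrated with a minimal sketch. This is only the textbook distance computation, not the thesis's clustering implementation: the function name `dtw_distance`, the absolute-difference cost, and the example request-count series are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric series.

    Uses an absolute-difference local cost (an assumption; the abstract
    does not state which cost the thesis uses).
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its element
                cost[i][j - 1],      # b advances, a repeats its element
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical per-window request counts for two flows: the same burst,
# shifted by one time window.
burst_late = [0, 0, 5, 9, 5, 0]
burst_early = [0, 5, 9, 5, 0, 0]
print(dtw_distance(burst_late, burst_early))  # 0.0
```

Because the warping path may repeat elements, the two shifted bursts align at zero cost, whereas a point-by-point comparison would treat them as different. That tolerance to time shifts is the usual motivation for pairing DTW with K-means when grouping behavior series.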