| Access superpoints are hosts that interact with a large number of peers in the network at the same time. They generally play an important role in the network, for example as servers, proxies, scanners, hosts under DDoS attack, or worm propagation sources. Although few in number, they consume a large share of network communication resources. Detecting access superpoints effectively in real time and dynamically monitoring their traffic behavior to catch abnormalities promptly is therefore of great significance to network security and network management. This thesis studies network anomaly detection from the perspective of access superpoint behavior analysis. Its main contributions are as follows: 1) In a 40 Gbps network environment, a statistical algorithm that uses sampled aggregate flow records as its data source and an estimation algorithm that uses all packets as its data source are compared in terms of superpoint detection accuracy. The statistical algorithm on sampled flow records is improved with a threshold model based on a heavy-tailed distribution, and the estimation algorithm is implemented on a GPU using the SRLA algorithm. The results show that the estimation algorithm significantly outperforms the statistical method on sampled flow records, even after compensation. Finally, a real-time access superpoint detection system based on the estimation algorithm is implemented at a 40 Gbps network boundary. 2) Because access superpoints with different degrees of activity call for a different focus in anomaly detection, a sliding-window-oriented time-frequency classification algorithm for access superpoints is proposed. According to their time-frequency attributes, access superpoints are divided into three types: high-frequency, medium-frequency, and low-frequency superpoints. Data from a real network environment confirm that the algorithm can accurately 
classify the detected access superpoints by time-frequency attributes in real time in a high-speed network environment. 3) For low-frequency access superpoints, an anomaly detection method based on frequent-item rules is proposed, which detects DDoS attacks and horizontal port scanning from the characteristic behavior patterns of these anomalies. Because P2P behavior resembles anomalous behavior, P2P superpoints are filtered out before anomaly identification: a binary classifier for P2P low-frequency superpoints that supports flow-record data sources is built with machine learning classification algorithms to perform the filtering. Data from a real network environment confirm that filtering P2P before anomaly identification further improves anomaly detection accuracy. 4) For high-frequency access superpoints, a peer-count anomaly detection algorithm based on predictability analysis is proposed for the case where only the number of connected peers is available. The Holt-Winters seasonal model, the SVR algorithm, and an LSTM neural network are each implemented, and their predictive performance is compared on time series of the number of connected peers. The experimental results show that the LSTM neural network gives the best predictions for the connected-peer time series of high-frequency superpoints. Finally, an LSTM-based anomaly detection system for the number of connected peers of high-frequency superpoints is implemented at the boundary of a 40 Gbps access network. 5) For high-frequency access superpoints, a real-time anomaly detection model based on the isolation forest algorithm is designed for the case where full traffic is available, and an anomaly analysis method based on key feature mining is proposed and shown to be effective at large-scale network boundaries. 6) Based on machine learning classification algorithms, a binary classifier is constructed for the two 
anomaly types of high-frequency superpoints, vertical scanning and flood attacks. To reduce the overfitting caused by the scarcity of anomaly samples, an SVC-based self-training semi-supervised learning algorithm is used to expand the sample set, and the expanded anomaly data set is then used to verify the classification performance. Experiments show that the constructed classifier can effectively and automatically classify vertical scanning and flood attacks on high-frequency access superpoints. |
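
As a rough illustration of the sliding-window time-frequency classification in contribution 2, the sketch below labels a host by the fraction of recent windows in which it was detected as a superpoint. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the peer threshold, and the `high`/`low` cut-offs are hypothetical, and exact distinct-peer counting with sets stands in for the SRLA-based estimation that the thesis runs on a GPU.

```python
from collections import defaultdict


def detect_superpoints(flows, peer_threshold):
    """Return hosts that contact at least `peer_threshold` distinct peers.

    `flows` is an iterable of (host, peer) pairs seen in one time window.
    Exact set counting is an assumption made for clarity; it replaces the
    estimation algorithm used in the thesis.
    """
    peers = defaultdict(set)
    for host, peer in flows:
        peers[host].add(peer)
    return {host for host, seen in peers.items() if len(seen) >= peer_threshold}


def classify_by_frequency(windows, peer_threshold, high=0.8, low=0.3):
    """Label each detected host as high-, medium-, or low-frequency.

    `windows` is the list of per-window flow lists inside the sliding window.
    The cut-offs `high` and `low` are hypothetical values, not from the thesis.
    """
    hits = defaultdict(int)
    for flows in windows:
        for host in detect_superpoints(flows, peer_threshold):
            hits[host] += 1

    labels = {}
    total = len(windows)
    for host, count in hits.items():
        ratio = count / total  # share of windows in which host was a superpoint
        if ratio >= high:
            labels[host] = "high"
        elif ratio >= low:
            labels[host] = "medium"
        else:
            labels[host] = "low"
    return labels


# Toy data: four consecutive windows of (host, peer) pairs.
example_windows = [
    [("A", "x"), ("A", "y"), ("B", "x"), ("B", "y"), ("C", "x"), ("C", "y")],
    [("A", "x"), ("A", "y"), ("B", "x"), ("B", "z")],
    [("A", "x"), ("A", "y"), ("B", "x")],
    [("A", "p"), ("A", "q"), ("C", "x")],
]
example_labels = classify_by_frequency(example_windows, peer_threshold=2)
```

In this toy input, host A qualifies as a superpoint in every window, B in two of four, and C in one of four, so the three hosts fall into different frequency classes. A production system would slide the window forward incrementally rather than recount every window on each update.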