| With the rapid spread of encryption technology in cyberspace, encrypted network traffic, as a carrier of network information, has also grown rapidly. Although encryption protects network information through ciphertext communication, attackers also exploit its concealment to evade traditional detection systems, creating a network security threat that cannot be ignored. How to analyze encrypted network traffic effectively has therefore become an urgent problem. This thesis studies the key technologies of encrypted network traffic analysis in two scenarios: anomalous traffic identification and application traffic identification. It applies the LightGBM algorithm to encrypted network traffic analysis for the first time and achieves good results. Building on the research into both identification methods, it designs an encrypted network traffic identification scheme named MGST-ETA, which not only identifies anomalous traffic and application traffic with high precision and efficiency, but also protects user privacy during identification. For anomalous traffic identification, this thesis first analyzes the characteristics of anomalous traffic at three granularities: packet payload, network traffic, and session connection information. Then, based on LightGBM, it proposes a binary classifier that separates anomalous from normal traffic and a multi-class classifier for malware families. A series of experiments determines a traffic-granularity feature set with good recognition accuracy. Comparison experiments then verify that, at the algorithm level, LightGBM effectively resists the imbalance in classifier precision caused by imbalanced data sets. In addition, this thesis uses the Borderline SMOTE-2 algorithm to optimize imbalanced data sets at the data level. After optimization, the average precision rate is improved by 6.16%.
For application traffic identification, this thesis first analyzes the characteristics of different applications' network traffic in the time domain and the frequency domain, and proposes a mixed-granularity spatio-temporal feature set based on the characteristics of accompanying traffic. It then proposes the LightGBM-Boruta feature selection algorithm and a multi-level LightGBM classifier algorithm based on feature importance feedback. Experimental results show that, on a published traffic dataset, the classifier achieves recognition accuracy comparable to the current best method, with an average precision above 91%, and that it can also identify application traffic of unknown application types and of specific application types. Based on these two methods, this thesis proposes MGST-ETA, an encrypted network traffic analysis scheme based on multi-granularity spatio-temporal features. As part of the scheme, a traffic collection system based on a mirror switch, a preprocessing toolset based on a multi-threaded parallel architecture, and a traffic analysis visualization function are designed and implemented. Finally, the scheme is deployed and tested in a real network environment. The test results show that the proposed scheme achieves a high detection rate, a low false alarm rate, and a low false negative rate in both the anomaly identification and application identification tasks. Compared with other typical encrypted network traffic analysis schemes, it offers early detection and fine-grained identification. |
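
The data-level step mentioned in the abstract, oversampling with Borderline SMOTE-2, can be illustrated with a small sketch. This is not the thesis's implementation: it is a simplified pure-Python variant written for illustration, and the function names (`k_nearest`, `borderline_smote2`), the parameters `m`, `k`, `per_sample`, and the toy 2-D data are all assumptions. The idea it shows is that only minority samples near the class border ("danger" samples) are used as seeds, and that synthetic points are interpolated towards minority neighbours (gap in [0, 1)) or towards majority neighbours (gap in [0, 0.5), so the new point stays closer to the minority seed).

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return the k candidates closest to `point` (Euclidean), excluding `point` itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: math.dist(point[0], c[0]))[:k]


def borderline_smote2(samples, minority_label, m=5, k=3, per_sample=2, seed=0):
    """Simplified Borderline SMOTE-2 sketch.

    `samples` is a list of (features, label) tuples. Assumes both classes are
    present and that there are more than `m` samples in total.
    """
    rng = random.Random(seed)
    minority = [s for s in samples if s[1] == minority_label]
    majority = [s for s in samples if s[1] != minority_label]
    synthetic = []
    for s in minority:
        neighbours = k_nearest(s, samples, m)
        n_major = sum(1 for n in neighbours if n[1] != minority_label)
        # "Danger" seeds: at least half, but not all, of the m neighbours are
        # majority samples. All-majority neighbourhoods are treated as noise.
        if not (m / 2 <= n_major < m):
            continue
        for _ in range(per_sample):
            if rng.random() < 0.5 and len(minority) > 1:
                neighbour = rng.choice(k_nearest(s, minority, k))
                gap = rng.random()          # in [0, 1)
            else:
                neighbour = rng.choice(k_nearest(s, majority, k))
                gap = rng.random() * 0.5    # in [0, 0.5): stay near the seed
            point = tuple(a + gap * (b - a) for a, b in zip(s[0], neighbour[0]))
            synthetic.append((point, minority_label))
    return synthetic


# Toy data (hypothetical): a 5x5 grid of majority points (label 0), three
# minority points on its edge, and one minority point buried inside the grid.
samples = [((float(x), float(y)), 0) for x in range(5) for y in range(5)]
samples += [((4.5, 2.0), 1), ((4.6, 2.5), 1), ((4.4, 1.5), 1), ((0.5, 0.5), 1)]

new_points = borderline_smote2(samples, minority_label=1)
print(len(new_points), "synthetic minority samples generated")
```

In practice the oversampled set (original samples plus `new_points`) would be passed to the classifier's training step; a maintained library implementation is preferable to this sketch for real experiments.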