| As malware increasingly uses the SSL/TLS protocol to encrypt its traffic, malicious traffic becomes more difficult to detect. Because few public datasets of malicious encrypted traffic exist, traditional detection methods struggle to extract effective encryption features. As a result, a single detection model classifies traffic with low accuracy, and online detection of malicious encrypted traffic cannot be realized. Malicious encrypted traffic detection has therefore become an urgent problem. Supported by the National Key Research and Development Program topic "Identity-based Trusted Protocol and Malicious Communication Behavior Monitoring Method", this paper constructs a malicious encrypted traffic dataset, extracts a malicious encrypted traffic feature set from it, and detects malicious encrypted traffic with an offline-training, online-detection approach. The main work is as follows: 1. Generating a malicious encrypted traffic dataset. The Cuckoo sandbox is used to build an environment for running malware samples. Samples of Ofowgin, Charger, Fake AV and Fake Inst are collected, representing four common types of malware (adware, ransomware, scareware and e-spam), and Wireshark is used to capture the malicious encrypted traffic in this environment and generate the dataset. 2. Obtaining a representative set of malicious encrypted traffic features. A feature extraction module built from Joy and Python scripts extracts 21 encryption features and 35 spatio-temporal features, effectively expanding the feature set of malicious encrypted traffic. Four feature selection methods are combined: the Pearson correlation coefficient, L2 regularization, random forest (RF) mean decrease in impurity, and XGBoost feature selection. Features that are selected more than three times across these methods are retained and redundant features are excluded, yielding a malicious encrypted traffic feature set of 25 features, 5 of which are encryption features. 3. Designing and implementing an XGBoost+LSTM two-stage detection model with an offline detection accuracy of 94.32%. Several performance metrics are used to compare single models, and the best-performing models are combined into the two-stage model. The model first uses an XGBoost detection model to separate normal traffic from malicious encrypted traffic, and then uses an LSTM classification model to classify the malicious encrypted traffic at a fine granularity. 4. Deploying an online detection environment to verify the detection capability of the XGBoost+LSTM two-stage detection model. Normal traffic and four types of malicious encrypted traffic in a 5G environment are simulated in the detection system, and the XGBoost+LSTM two-stage detection model is deployed in the online detection system to detect and classify the traffic. Experimental results show that, with a time window of 120 s, the detection model reaches an accuracy of 93.33% and a malicious traffic detection rate of 99.2%. |
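
The cascade described in item 3 can be illustrated with a minimal Python sketch. It shows only the control flow: a binary stage decides whether a flow is malicious, and a second stage assigns a malware family to the flows flagged as malicious. The function names `two_stage_detect`, `toy_binary_stage` and `toy_family_stage`, the feature layout, and the thresholds are all hypothetical; the toy stages are simple rules standing in for the trained XGBoost and LSTM models, which are not reproduced here.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]

# The four malware families named in the abstract; "normal" marks benign traffic.
FAMILIES = ["Ofowgin", "Charger", "Fake AV", "Fake Inst"]


def two_stage_detect(
    flows: List[Features],
    is_malicious: Callable[[Features], bool],
    classify_family: Callable[[Features], str],
) -> List[str]:
    """Label each flow as 'normal' or with a malware family name."""
    labels = []
    for features in flows:
        if is_malicious(features):
            # Stage 2 runs only on flows that stage 1 flagged as malicious.
            labels.append(classify_family(features))
        else:
            labels.append("normal")
    return labels


# Hypothetical stand-ins for the trained models (assumptions for illustration).
def toy_binary_stage(features: Features) -> bool:
    # Pretend features[0] is a malicious-probability score from stage 1.
    return features[0] > 0.5


def toy_family_stage(features: Features) -> str:
    # Pretend features[1] encodes the family index predicted by stage 2.
    return FAMILIES[int(features[1]) % len(FAMILIES)]


flows = [(0.1, 0.0), (0.9, 1.0), (0.7, 3.0)]
labels = two_stage_detect(flows, toy_binary_stage, toy_family_stage)
print(labels)
```

Passing the two stages as callables keeps the cascade logic independent of the model libraries, so real classifiers could be substituted without changing `two_stage_detect`.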