| With the continuous advancement of traffic encryption technology, much malicious traffic now hides its attack attempts behind encryption, so detecting encrypted malicious traffic has become urgent. Two challenges remain in encrypted traffic detection: 1) the class imbalance of encrypted traffic datasets is difficult to mitigate; 2) deep learning-based methods suffer from insufficient feature extraction and from feature loss caused by overly deep models. To address these issues, this thesis investigates encrypted traffic detection from the perspectives of both data sampling and model optimisation. The main work is as follows. 1) A hybrid sampling algorithm (KPMS) based on K-means++ clustering is proposed to address the imbalance of encrypted traffic samples. An undersampling algorithm is designed for the majority-class samples and an oversampling algorithm is designed for the minority-class samples. Comparative experiments on two publicly available datasets show that the samples produced by the KPMS algorithm effectively fit the sample distribution of the original dataset. 2) To address insufficient feature extraction for malicious encrypted traffic samples and the loss of feature information as the network grows deeper, the encrypted traffic detection model De MSC-BiGRU is proposed, based on feature map fusion and a shortcut-connection mechanism. The model alleviates feature loss and improves the representation of traffic samples, and the integrated BiGRU network enhances the extraction of sequence features. Validated on two publicly available datasets after balancing with the KPMS algorithm, the model achieves an average F1-score of 98.71%, an improvement of about 3% over existing work. In summary, the KPMS hybrid sampling algorithm and the De MSC-BiGRU traffic detection model proposed in this thesis effectively alleviate the imbalance of encrypted traffic datasets, reduce misclassification and missed detection of minority-class malicious samples, and improve the detection rate of malicious samples in encrypted traffic. |
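
The abstract does not give the details of KPMS, so the following is only a minimal sketch of the general idea of cluster-guided hybrid sampling, not the thesis's algorithm. It assumes that the majority class is undersampled by keeping a share of each K-means++ cluster proportional to its size, and that the minority class is oversampled by interpolating between a sample and its cluster centre. The function names `kmeans_pp` and `hybrid_sample` and the parameters `target` and `k` are hypothetical.

```python
import math
import random


def kmeans_pp(points, k, rng, iters=20):
    """Cluster points (tuples of floats) with k-means++ seeding and Lloyd iterations."""
    # k-means++ seeding: the first centre is uniform; each later centre is drawn
    # with probability proportional to squared distance from the nearest centre.
    centres = [rng.choice(points)]
    while len(centres) < k:
        d2 = [min(math.dist(p, c) ** 2 for c in centres) for p in points]
        if sum(d2) == 0:  # every point coincides with an existing centre
            centres.append(rng.choice(points))
        else:
            centres.append(rng.choices(points, weights=d2, k=1)[0])
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centres[i]))
            clusters[idx].append(p)
        # Move each centre to the mean of its members (keep it if the cluster is empty).
        centres = [
            tuple(sum(x) / len(c) for x in zip(*c)) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return centres, clusters


def hybrid_sample(majority, minority, target, k=3, seed=0):
    """Assumed scheme: bring both classes to roughly `target` samples each."""
    rng = random.Random(seed)

    # Undersampling: keep a share of each majority cluster proportional to its size,
    # so the reduced set roughly follows the original cluster structure.
    _, maj_clusters = kmeans_pp(majority, k, rng)
    kept = []
    for c in maj_clusters:
        quota = round(target * len(c) / len(majority))
        kept.extend(rng.sample(c, min(quota, len(c))))

    # Oversampling: create synthetic minority points on the segment between a
    # real sample and the centre of its own cluster.
    centres, clusters = kmeans_pp(minority, min(k, len(minority)), rng)
    nonempty = [(ctr, c) for ctr, c in zip(centres, clusters) if c]
    weights = [len(c) for _, c in nonempty]
    synthetic = []
    while len(minority) + len(synthetic) < target:
        ctr, c = rng.choices(nonempty, weights=weights, k=1)[0]
        p = rng.choice(c)
        t = rng.random()
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(p, ctr)))
    return kept, list(minority) + synthetic
```

Because the per-cluster quotas are rounded, the undersampled majority set can differ from `target` by a few samples, while the oversampled minority set reaches `target` exactly. Interpolating toward the cluster centre keeps synthetic points inside the region already occupied by real minority samples, which is one simple way to stay close to the original distribution.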