| With the widespread adoption of encryption protocols, network traffic encryption has brought major gains in data security and privacy. However, the opacity of encrypted traffic makes fine-grained traffic management computationally expensive. Identifying encrypted traffic quickly, effectively, and accurately has therefore become both a research hotspot and an open difficulty in network security. Because traffic characteristics change after encryption, traditional traffic detection methods no longer apply in encrypted environments. At present, identification relies mainly on machine learning models trained on features extracted from encrypted traffic. However, the imbalanced distribution of network traffic and the difficulty of collecting encrypted traffic leave these models poorly able to identify minority classes that have too few samples in the dataset, which seriously reduces the overall recognition accuracy and generalization ability of the model. To address these problems, the main work and research contents of this paper are as follows: (1) This paper divides encrypted traffic identification techniques into machine-learning-based methods and cryptography-based methods. Machine-learning-based malicious traffic identification extracts malicious features from encrypted traffic, builds a malicious feature set, and feeds it to the model as the training set; the desired accuracy is then reached through model design and parameter optimization. Cryptography-based malicious traffic identification uses ciphertext retrieval to search encrypted traffic for malicious keywords, identifying malicious traffic by deeply integrating searchable encryption, traffic inspection mechanisms, and provable security models. (2) The imbalance of current encrypted traffic datasets severely restricts the upper
performance limit of machine learning models. Given the particular nature of the encrypted traffic identification task, the dataset imbalance must be resolved while also addressing the computational and time costs of encrypted traffic identification. This paper therefore designs and implements a small-sample data augmentation technique based on adaptive sampling, which uses an adaptive sampling algorithm to augment and expand the small-sample traffic data and construct an encrypted traffic dataset with a balanced class distribution. Experimental results show that the proposed scheme performs well on public datasets: the overall recognition accuracy after sampling reaches 99%, and the evaluation metrics on small-sample classes exceed those of several existing schemes. (3) Generative adversarial networks (GANs) are introduced into encrypted traffic identification in an attempt to address the difficulty of traffic collection and the high cost of dataset updates in this field. By improving how the GAN loss function is computed, this paper addresses the problem of generating encrypted traffic with a GAN model and ensures the diversity of the generated samples without losing encrypted traffic information, thereby augmenting and expanding the dataset. Experimental results show that the proposed scheme performs well across different experimental scenarios and resolves the imbalance of encrypted traffic datasets; the dataset generated by WGAN outperforms the sampled dataset on the evaluation metrics and also compares favorably with existing schemes. |
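
The adaptive sampling idea in item (2) can be illustrated with a minimal sketch. The abstract does not name the exact algorithm, so the code below assumes an ADASYN-style scheme: minority samples whose neighbourhoods contain more majority samples receive more synthetic points, and each synthetic point is interpolated between two minority samples. The function name `adaptive_oversample`, its parameters, and the use of plain NumPy feature vectors are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def adaptive_oversample(X, y, minority_label, k=5, seed=0):
    """ADASYN-style sketch (assumed, not the paper's exact method)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    min_idx = np.flatnonzero(y == minority_label)
    n_min = len(min_idx)
    # Number of synthetic samples needed to match the largest class.
    n_new = np.unique(y, return_counts=True)[1].max() - n_min
    if n_new <= 0 or n_min < 2:
        return X, y

    # Hardness of each minority sample: share of other-class points
    # among its k nearest neighbours in the full dataset.
    k_all = min(k, len(y) - 1)
    d_all = np.linalg.norm(X[min_idx, None, :] - X[None, :, :], axis=2)
    d_all[np.arange(n_min), min_idx] = np.inf  # exclude the point itself
    nn_all = np.argsort(d_all, axis=1)[:, :k_all]
    ratio = (y[nn_all] != minority_label).mean(axis=1)
    if ratio.sum() > 0:
        weights = ratio / ratio.sum()
    else:  # no hard samples: fall back to uniform allocation
        weights = np.full(n_min, 1.0 / n_min)

    # Allocate the synthetic samples adaptively across minority points.
    counts = rng.multinomial(n_new, weights)

    # Neighbours restricted to the minority class, used for interpolation.
    X_min = X[min_idx]
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d_min, np.inf)
    nn_min = np.argsort(d_min, axis=1)[:, :min(k, n_min - 1)]

    synthetic = []
    for i, c in enumerate(counts):
        for _ in range(c):
            j = rng.choice(nn_min[i])
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))

    X_out = np.vstack([X, np.array(synthetic)])
    y_out = np.concatenate([y, np.full(n_new, minority_label)])
    return X_out, y_out
```

Calling `adaptive_oversample(X, y, minority_label=1)` on a feature matrix with a rare class `1` returns the original samples followed by enough synthetic minority samples to match the largest class. For a multi-class traffic dataset, the function could be applied once per under-represented class.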
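
Item (3) mentions a modified GAN loss and a WGAN-generated dataset but does not give the loss itself. As a point of reference only, the sketch below shows the standard Wasserstein GAN objective with weight clipping, computed on critic scores; whether the paper's improved loss matches this form is an assumption. The function names `critic_loss`, `generator_loss`, and `clip_weights` are hypothetical.

```python
import numpy as np

def critic_loss(scores_real, scores_fake):
    """Standard WGAN critic loss: E[D(fake)] - E[D(real)].
    Minimising it widens the score gap between real and generated traffic."""
    return float(np.mean(scores_fake) - np.mean(scores_real))

def generator_loss(scores_fake):
    """Standard WGAN generator loss: -E[D(fake)].
    Minimising it pushes critic scores of generated samples upward."""
    return float(-np.mean(scores_fake))

def clip_weights(weights, c=0.01):
    """Weight clipping from the original WGAN formulation, used to keep
    the critic approximately Lipschitz-bounded. `weights` is a list of arrays."""
    return [np.clip(w, -c, c) for w in weights]
```

Unlike the log-based loss of the original GAN, these losses are unbounded real-valued scores, which is commonly credited with more stable training and less mode collapse; this is consistent with the abstract's emphasis on the diversity of generated samples.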