
Research And Implementation Of Encrypted Traffic Classification Technology Based On Deep Learning

Posted on: 2021-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: M D Ma
Full Text: PDF
GTID: 2428330632462699
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, the network traffic generated by network services and applications has grown explosively. At the same time, information security has received increasing attention. Cryptographic techniques are used to protect privacy and secure data in transit, and as they are widely adopted, the proportion of encrypted traffic in network transmission keeps rising. However, while encryption protects information, it also poses severe challenges to abnormal traffic detection and network supervision: many viruses, worms, and other malware use encryption and tunneling to bypass the detection and defense of security devices. Detecting and identifying encrypted traffic has therefore become a hot topic in the industry.

At present, traditional methods based on port numbers and deep packet inspection (DPI) cannot identify encrypted traffic. Methods based on shallow machine learning achieve some success in classifying encrypted traffic, but their accuracy is limited because features are extracted manually. In recent years, researchers have explored deep learning for encrypted traffic extensively, but these studies consider only one type of characteristic of encrypted traffic. For example, a conventional CNN model considers only the spatial characteristics of encrypted traffic, and an LSTM model considers only the temporal characteristics, so incomplete feature learning leads to low accuracy. In response, this thesis analyzes both the spatial and the temporal characteristics of encrypted traffic, studies how the important features within them affect classification, and proposes a model that combines CNN, LSTM, and attention mechanisms. The proposed model supports the classification and identification of encrypted traffic and has practical application scenarios and significance.

The research content and conclusions of this thesis are:

1. A complete experimental scheme for classifying encrypted traffic by application is proposed. It comprises data acquisition, data preprocessing, model training (including feature learning and classification), and testing. First, the experimental data set is built from the public ISCX VPN-NonVPN data. Second, the original data set is preprocessed with the USTC-TK2016 tool to obtain a data vector matrix, which is stored in IDX files. The data are then divided into a training set and a test set, and the training set is input into the attention-based CNN-LSTM model for feature learning. Finally, the model is trained on the learned features, and the test set is used to verify its effectiveness.

2. A classification model based on CNN, LSTM, and attention mechanisms is proposed. First, according to differences in the spatial characteristics of application traffic, such as packet size, flow size, and number of packets, the convolutional layers of the CNN learn the spatial characteristics of the encrypted traffic, extract local features of the data, and combine and abstract them into high-dimensional features. An attention mechanism is then introduced: the output features of the convolutional layer are weighted by the attention weights to obtain weighted high-dimensional features, and a pooling layer reduces their dimensionality to give the output of the CNN layer. Next, differences in temporal characteristics are examined, such as the average arrival time of encrypted packets, the interval between packet transmissions, and the order of traffic requests. The CNN output is fed into the LSTM model, which extracts these temporal features. The important features within the time series are also taken into account; for example, the information in the header is more important than the small amount of data at the tail. An attention mechanism is therefore introduced to weight the LSTM output features by attention probability, yielding the overall (global) characteristics of the encrypted traffic. Finally, the local and global features are fused, and the fusion result is input into the classifier, whose classification function produces the output.

3. A CNN-LSTM encrypted traffic classification system based on the attention mechanism is designed and implemented. The system includes online encrypted traffic collection, data preprocessing, and test modules, which verify the usability and effectiveness of the proposed method. The method is also compared with the HST-R method and with commonly used deep learning algorithms such as CNN, LSTM, and CNN-LSTM. The results show that the proposed scheme achieves significantly higher classification accuracy and has reference value for applications.
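The preprocessing step described in the abstract (raw traffic turned into a data vector matrix and stored in IDX files) can be illustrated with a minimal sketch. It is not the USTC-TK2016 tool itself. The 784-byte sample length (a 28 x 28 matrix) and the helper names `to_fixed_length` and `to_idx3` are assumptions made for illustration, since the abstract does not state these details. Only the IDX header layout (two zero bytes, a type byte, a dimension count, then big-endian sizes) follows the published format.

```python
import struct
from typing import Sequence

FLOW_LEN = 784  # assumed sample size (28 x 28 bytes); not stated in the abstract


def to_fixed_length(payload: bytes, length: int = FLOW_LEN) -> bytes:
    """Truncate a flow's raw bytes to `length`, or right-pad it with 0x00."""
    return payload[:length].ljust(length, b"\x00")


def to_idx3(samples: Sequence[bytes], rows: int = 28, cols: int = 28) -> bytes:
    """Serialize equally sized samples as an IDX array of unsigned bytes."""
    for sample in samples:
        if len(sample) != rows * cols:
            raise ValueError("every sample must hold rows * cols bytes")
    # Header: 0x00 0x00, 0x08 (unsigned byte), 0x03 (three dimensions),
    # followed by the three dimension sizes as big-endian 32-bit integers.
    header = struct.pack(">BBBBIII", 0, 0, 0x08, 3, len(samples), rows, cols)
    return header + b"".join(samples)


if __name__ == "__main__":
    # Two made-up payloads: one shorter and one longer than FLOW_LEN.
    flows = [b"\x16\x03\x01\x02\x00", b"\xff" * 1000]
    blob = to_idx3([to_fixed_length(f) for f in flows])
    print(len(blob))  # 16-byte header plus 2 * 784 data bytes
```

Fixing every sample to the same length lets a convolutional layer treat each flow as a small grayscale image, which is one common way to expose the spatial characteristics the abstract mentions.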
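The attention weighting of the LSTM output features can likewise be sketched in a few lines. The abstract does not give the scoring function, so this example assumes a simple dot-product score against a hypothetical learned vector `query`. The scores are normalized with softmax into attention probabilities, and the hidden states are summed with those weights. The thesis's actual formulation may differ.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (attention weights, weighted sum) over per-step hidden states.

    `query` stands in for a learned parameter vector (an assumption here).
    """
    # Score each time step by its dot product with the query vector.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden]
    weights = softmax(scores)
    dim = len(hidden[0])
    # Context vector: sum over time steps of weight * hidden state.
    context = [sum(w * h[d] for w, h in zip(weights, hidden)) for d in range(dim)]
    return weights, context


if __name__ == "__main__":
    # Three made-up time steps with two features each.
    states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    weights, context = attention_pool(states, query=[1.0, 0.0])
    print(weights, context)
```

With this scoring, time steps whose hidden state aligns with `query` receive larger weights. This mirrors the abstract's point that some positions, such as header information, should count more than others.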
Keywords/Search Tags: encrypted traffic classification, deep learning, LSTM, CNN, attention mechanism