
Classification Of Encrypted Traffic Application Service Based On Spark Platform

Posted on: 2018-06-14 | Degree: Master | Type: Thesis
Country: China | Candidate: R S Ding | Full Text: PDF
GTID: 2348330518997007 | Subject: Information security
Abstract/Summary:
With the rapid development of the Internet, networks have reached an unprecedented scale, and network applications have become more complex and diverse. The increasingly serious network security situation has drawn more attention to the protection of network information. As one of the most popular encryption protocols, SSL/TLS protects users' privacy and guards against illegal attacks. However, existing research on SSL/TLS encrypted traffic rarely addresses the application services at the application layer. SSL/TLS is used in complex and diverse ways, so classifying its application services is valuable for monitoring and analyzing encrypted traffic.

Building on existing SSL/TLS traffic research, this thesis proposes a method for classifying the application services of encrypted traffic and implements a classification system on the Spark platform. First, we propose a method to mark the application service in SSL/TLS encrypted traffic by extracting the common name, certificate chain, and session ID from the handshake process, together with an additional extension field of the "Client Hello" message. The encrypted traffic is then labeled according to the application service mapping relationship. Second, considering the position of SSL/TLS application services in the application layer of the network protocol stack, we use a simple C4.5 decision tree to filter the HTTPS traffic and then a Random Forest to classify the application services carried over HTTPS, optimizing the sample feature values with regard to the network environment. The classification accuracy of this method is 96% in our experiment.

Finally, based on this method, we implement a traffic classification system on the Spark platform that combines Spark Streaming with Spark MLlib and uses Kafka to manage messages. The system consists of several modules and achieves good results, with an average accuracy of over 90%.
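The labeling step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes that the "Client Hello" extension in question is the server name indication (SNI), that the mapping relationship is a table of domain suffixes, and that the longest matching suffix wins. The names `SERVICE_MAP`, `Handshake`, `match_service`, and `label_flow` are hypothetical, and the certificate chain and session ID are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from domain suffix to application service label.
SERVICE_MAP = {
    "mail.example.com": "mail",
    "video.example.com": "video",
    "example.com": "web",
}


@dataclass
class Handshake:
    """Fields assumed to have been parsed from an SSL/TLS handshake."""
    sni: Optional[str] = None          # assumed Client Hello extension (server name)
    common_name: Optional[str] = None  # subject common name of the server certificate


def match_service(hostname, service_map=SERVICE_MAP):
    """Return the label of the longest matching domain suffix, or None."""
    if not hostname:
        return None
    host = hostname.lower().rstrip(".")
    if host.startswith("*."):          # wildcard certificate name
        host = host[2:]
    best = None
    for suffix, label in service_map.items():
        if host == suffix or host.endswith("." + suffix):
            if best is None or len(suffix) > len(best[0]):
                best = (suffix, label)
    return best[1] if best else None


def label_flow(hs):
    """Prefer the SNI, fall back to the certificate common name."""
    return match_service(hs.sni) or match_service(hs.common_name) or "unknown"
```

For example, `label_flow(Handshake(sni="smtp.mail.example.com"))` yields `"mail"` under this mapping, while a handshake with neither field set is labeled `"unknown"`. Flows labeled this way could then serve as training samples for the classifiers mentioned above.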
Keywords/Search Tags:SSL/TLS, application service, machine learning, Random Forest, Spark platform