
HTTPS Tunnel Traffic Detection Based On Fingerprint And Statistical Features

Posted on: 2020-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: W X Yin
Full Text: PDF
GTID: 2428330602951383
Subject: Engineering
Abstract/Summary:
As informatization continues to deepen, a great number of devices and a huge amount of data have been connected to the Internet, which brings great convenience to social life but also poses severe security challenges. On the one hand, various types of cyberattacks are emerging, and data leakage incidents are exposed from time to time; on the other hand, the widespread use of traffic obfuscation tools has caused some network traffic censorship mechanisms to fail. Although cyberattack tools and traffic obfuscation tools serve different purposes, the techniques they use to evade detection share many similarities. With the widespread adoption of encryption, traditional methods based on deep packet inspection no longer seem effective. The common challenge now faced by cyber security defenders and network traffic censors is how to accurately detect cyberattacks or obfuscated traffic within ever-growing network traffic without harming legitimate business.

At present, the main method used to circumvent traffic detection is obfuscation, which disguises network traffic as benign traffic and includes three main approaches: randomization, mimicry and tunneling. If obfuscated traffic shares enough similarities with benign traffic, the false alarm rate of the detection system increases dramatically, forcing the system to give up blocking the obfuscated traffic before it causes too much damage to normal services. To circumvent detection, some obfuscation tools masquerade as the widely adopted HTTPS protocol. For example, Meterpreter is an advanced dynamic attack tool in the Metasploit penetration testing framework. To circumvent detection, it now has many variants, including one disguised as the HTTPS protocol, yet few attempts to detect its traffic have been reported publicly so far. As another example, Shadowsocks is one of the most ubiquitous censorship circumvention tools. To keep a low profile, the community developed the simple-obfs obfuscation plugin to disguise its traffic as HTTPS; however, users have been skeptical about its effectiveness. Therefore, this paper studies the detection of these two HTTPS tunneling tools from the perspective of obfuscated network traffic identification.

(1) After analyzing studies of the two main methods of obfuscated traffic detection, deep packet inspection and machine learning, together with work in related fields, a machine learning based approach combining the fingerprint features and statistical features of network traffic is proposed. On the one hand, the plaintext fingerprint features in the HTTPS handshake records, such as the cipher suites and extensions supported by the client, are converted into numerical features; on the other hand, secondary features are derived from the statistical features of the HTTPS application records. Finally, the fingerprint features, the statistical features and the secondary features are fed into a machine learning classification algorithm to train the detection model.

(2) After evaluating multiple open datasets and analyzing the working principles and traffic characteristics of Meterpreter HTTPS and Shadowsocks Obfs, two tunnel traffic collection systems and one normal HTTPS traffic collection system are built. A total of 24,571 normal HTTPS traffic samples, 17,812 Meterpreter HTTPS tunnel traffic samples and 7,692 Shadowsocks Obfs tunnel traffic samples are collected to serve the tunneling traffic detection system.

(3) An HTTPS tunneling traffic detection system based on fingerprint and statistical features is designed and implemented. It automatically carries out tasks such as traffic sample analysis, feature extraction, classification model training and sample type assessment. Based on this system and the collected dataset, the performance of 8 classification algorithms covering multiple categories is tested. After assessing the precision, recall and F1-score of each algorithm, random forest is found to have the best overall performance, with precision and recall both over 99%. Feature importance rankings are obtained from the random forest algorithm used in the system to detect the traffic of these two tunnels. The influences of the fingerprint features and the statistical features are found to be comparable, and the features with greater importance when detecting the traffic of these two tunnels are largely the same, which indicates that this system should be scalable.
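The conversion of handshake fingerprints and application-record statistics into numerical features, as described in (1), might look roughly like the following minimal Python sketch. It is not the thesis's implementation. The vocabularies, function names and the particular statistics (count, mean, standard deviation, minimum, maximum of record lengths) are assumptions chosen for illustration, and the multi-hot encoding is only one plausible way to turn cipher suite and extension lists into numbers.

```python
from statistics import mean, pstdev

# Assumed, illustrative vocabularies: a few IANA cipher-suite and
# extension code points. A real system would use a much larger list.
CIPHER_VOCAB = [0x1301, 0x1302, 0x1303, 0xC02B, 0xC02F, 0xC02C, 0xC030, 0x009C]
EXTENSION_VOCAB = [0, 5, 10, 11, 13, 16, 23, 35, 43, 51]


def multi_hot(values, vocab):
    """Return one 0/1 entry per vocabulary item: 1 if the client offered it."""
    present = set(values)
    return [1 if v in present else 0 for v in vocab]


def fingerprint_features(cipher_suites, extensions):
    """Encode the plaintext ClientHello fields as a fixed-length numeric list."""
    return (
        multi_hot(cipher_suites, CIPHER_VOCAB)
        + multi_hot(extensions, EXTENSION_VOCAB)
        + [len(cipher_suites), len(extensions)]
    )


def statistical_features(record_lengths):
    """Summarize the lengths of the encrypted application records of a flow."""
    if not record_lengths:
        return [0, 0.0, 0.0, 0, 0]
    return [
        len(record_lengths),
        mean(record_lengths),
        pstdev(record_lengths),
        min(record_lengths),
        max(record_lengths),
    ]


def flow_vector(cipher_suites, extensions, record_lengths):
    """Concatenate both feature groups into one sample for a classifier."""
    return (
        fingerprint_features(cipher_suites, extensions)
        + statistical_features(record_lengths)
    )


if __name__ == "__main__":
    # Hypothetical flow: two offered cipher suites, three extensions,
    # three application records.
    print(flow_vector([0x1301, 0xC02F], [0, 10, 43], [100, 200, 300]))
```

Vectors of this shape, one per flow and labeled as normal HTTPS, Meterpreter HTTPS or Shadowsocks Obfs, are the kind of input a classifier such as random forest could be trained on.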
Keywords/Search Tags:HTTPS Tunneling Traffic, Obfuscation Detection, Deep Packet Inspection, Machine Learning, Random Forest, Meterpreter HTTPS, Shadowsocks Obfs