
Research On Traffic Identification Based On Shadowsocks

Posted on: 2020-03-09  Degree: Master  Type: Thesis
Country: China  Candidate: H S He  Full Text: PDF
GTID: 2416330596468995  Subject: Public Security Technology
Abstract/Summary:
As an emerging anonymous communication tool, Shadowsocks is becoming increasingly popular in China because of its communication efficiency and stability. At present, there are few studies of Shadowsocks in China, and there is no systematic research on its operating mechanism. This thesis carries out the following research on the identification of Shadowsocks traffic:

(1) After observing and analyzing a large number of Shadowsocks traffic samples, this thesis is the first to systematically describe and summarize the operating mechanism and communication principle of Shadowsocks.

(2) Traditional traffic statistics feature sets may contain highly similar, mutually dependent, and redundant features, which greatly reduces classifier performance during recognition. To address this, the thesis proposes a feature extraction model based on principal component analysis and the Pearson correlation coefficient. The model reduces the dimensionality of the traditional traffic statistics features, removes features that correlate weakly with the samples, and selects a feature set with strong correlation and low redundancy that is suitable for identifying Shadowsocks traffic. The feature set is then verified with three classifiers: Random Forest, Support Vector Machine (SVM), and eXtreme Gradient Boosting (XGBoost). The experimental results show that the feature set maintains recognition accuracy while greatly improving recognition efficiency.

(3) Classifiers achieve a low recognition rate on unbalanced sample sets in which Shadowsocks traffic makes up only a small proportion. To address this, the thesis proposes a recognition model based on multiple filtering of network flows. After observing and analyzing a large number of Shadowsocks traffic samples, the thesis extracts a character entropy feature and proposes a filtering method based on character entropy. Using the extracted message length sequence characteristics, it proposes a second filtering method based on the message length sequence. Building on the results of these two steps, it proposes a third filtering method based on packet length entropy. The three filtering steps are then applied together to the mixed stream, and the reduced-dimensional feature set is combined with the XGBoost algorithm to construct the classification model. The model was validated on multiple different real data sets. By filtering Shadowsocks and non-Shadowsocks traffic in the mixed stream in advance, the model effectively improves recognition accuracy on unbalanced sample sets and greatly improves recognition efficiency.
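
The abstract does not say how character entropy or packet length entropy is computed. The sketch below illustrates one plausible reading, assuming both are Shannon entropy: over the byte values of a payload in the first case, and over the sequence of packet lengths in a flow in the second. The function names and the threshold of 7.0 bits per byte are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy, in bits per symbol, of a sequence of hashable symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over symbols of p * log2(1 / p), where p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def looks_encrypted(payload: bytes, threshold: float = 7.0) -> bool:
    """Hypothetical character-entropy filter.

    Encrypted payloads tend toward the 8 bits/byte maximum, while plaintext
    protocols score much lower. The threshold is an illustrative assumption.
    """
    return shannon_entropy(payload) >= threshold


def packet_length_entropy(packet_lengths):
    """Entropy of the packet lengths seen in one flow (assumed definition)."""
    return shannon_entropy(packet_lengths)


if __name__ == "__main__":
    uniform = bytes(range(256)) * 4              # every byte value equally likely
    plaintext = b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
    print(looks_encrypted(uniform))              # high-entropy payload passes the filter
    print(looks_encrypted(plaintext))            # low-entropy payload is filtered out
    print(packet_length_entropy([1448, 1448, 52, 52]))
```

A real filter would need its threshold calibrated on labeled traffic, because short payloads produce lower sample entropy than long ones even when they are encrypted.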
Keywords/Search Tags:Traffic identification, Feature extraction, Shadowsocks, XGBoost, Entropy