| With the rapid development of mobile networks, the number of mobile applications has increased sharply, and mobile Internet traffic now accounts for a large share of overall network traffic. Mobile application traffic identification is therefore becoming increasingly important for network management and network security. At present, identification based on deep packet inspection is the mainstream technology, and its accuracy is relatively high. However, this technology relies on pre-collected labeled mobile application traffic, and its recognition results depend on the accuracy of the signatures and on application coverage. Given the massive mobile application market and the thousands of applications updated every month, building a feature database is a challenging task, which severely restricts the practical effectiveness of mobile application traffic identification based on deep packet inspection. This paper therefore studies the automatic extraction of mobile application traffic features. The main work is as follows: (1) A mobile application traffic feature extraction method based on frequent itemsets is proposed, which extracts the keys (the left-hand values of key-value pairs) from the URL text of the mobile application HTTP payload. The method converts sample feature extraction into a classic frequent itemset problem and uses an improved FP-Growth algorithm to extract application features automatically. The experimental results show that the average recall, average accuracy, and average FPR of this method are better than those of the comparison method when only a small number of samples, 100 labeled samples per application, is available. (2) This paper studies Android third-party services and sorts out their types and functions, as well as the information that associates them with mobile applications. It also proposes methods for extracting the association between third-party services and Android applications. This association provides the seed features for a feature expansion method based on spatio-temporal correlation. The method targets unlabeled mobile traffic, such as real network traffic; it reduces the work of obtaining labeled mobile application traffic and improves the efficiency of mobile application traffic feature extraction. (3) This paper designs and implements an automatic mobile application traffic feature expansion system based on spatio-temporal correlation. Using third-party service identifiers and existing application features as seed features for application identification, the system identifies application traffic through the spatio-temporal correlation of network traffic. The frequent itemset feature extraction method is then used to extract new application features, which are added to the seed features. In this way the system automatically expands the set of mobile application traffic features. The experimental results show that the recall of this method is higher than that of the method in Chapter 3, while there is no significant difference in accuracy or FPR. |
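
As a rough illustration of contribution (1), the following Python sketch treats the query-string keys of each URL as one transaction and reports the key sets that occur in at least a given fraction of URLs. It is only a sketch under stated assumptions: the function names, the sample URLs, and the `min_support` and `max_size` parameters are hypothetical, and a brute-force enumeration stands in for the improved FP-Growth algorithm, whose details are not described here.

```python
from collections import Counter
from itertools import combinations
from urllib.parse import urlsplit, parse_qsl


def url_keys(url):
    """Return the set of query-string keys (left-hand sides of key=value pairs)."""
    query = urlsplit(url).query
    return frozenset(key for key, _ in parse_qsl(query, keep_blank_values=True))


def frequent_key_sets(urls, min_support=0.5, max_size=3):
    """Brute-force frequent itemset mining over per-URL key sets.

    Returns {frozenset_of_keys: support}, where support is the fraction of
    URLs that contain every key in the set. This is a simple stand-in for
    FP-Growth, not the improved algorithm used in the paper.
    """
    transactions = [url_keys(u) for u in urls]
    n = len(transactions)
    if n == 0:
        return {}
    counts = Counter()
    for keys in transactions:
        # Count every subset of this URL's keys, up to max_size keys.
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(keys), size):
                counts[frozenset(combo)] += 1
    return {items: c / n for items, c in counts.items() if c / n >= min_support}


# Hypothetical URLs standing in for labeled traffic of one application.
sample_urls = [
    "http://api.example.com/v1/feed?appid=123&ver=2.0&uid=a1",
    "http://api.example.com/v1/feed?appid=123&ver=2.1&uid=b2",
    "http://api.example.com/v1/ad?appid=123&slot=7",
    "http://cdn.example.com/img?w=100&h=50",
]
result = frequent_key_sets(sample_urls, min_support=0.5)
for items, support in sorted(result.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(items), support)
```

Key sets that survive the support threshold are candidate features for the application; a real system would also need to filter out keys shared by many different applications.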