
Research On Key Technologies Of Mobile Application Traffic Identification

Posted on: 2021-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: Q M Rao
GTID: 2428330629951036
Subject: Signal and Information Processing
Abstract/Summary:
Mobile application traffic has become a major component of network traffic. Mobile applications are booming and now cover almost every aspect of users' daily lives, including social networking, shopping, travel, medical care, and entertainment, so mobile application traffic carries abundant user information. Obtaining this information is of great significance to network surveillance, but it first requires accurate traffic identification, that is, determining which application generated a given flow. Machine learning performs well on classification tasks, so machine-learning-based traffic identification has great potential and has been studied extensively, but it cannot meet the identification requirements of large-scale application scenarios. This thesis therefore starts from how mobile applications communicate with their servers and from traffic identification based on Deep Packet Inspection (DPI), and studies in depth traffic identification methods suited to large-scale application scenarios. It mainly discusses the following:

(1) Traffic identification algorithm based on a static DNS database. A static DNS database is built by extracting IP-domain name pairs from massive numbers of DNS packets, and a method based on crawler technology and dictionary tree (trie) matching is proposed to establish the correspondence between domain names and mobile applications. Together these yield an IP to domain name to mobile application mapping, so that application traffic can be identified effectively from the server IP.

(2) Mobile audio and video application traffic identification algorithm. Based on a study of how audio and video applications access CDN servers, and of the proportion of traffic that audio and video generate, an algorithm is proposed that identifies the non-encrypted and encrypted flows of audio and video applications synchronously. The feature strings corresponding to a mobile application are extracted from the Host and User-Agent fields of HTTP packets and from the ServerName field in the plaintext extension area of the HTTPS protocol, which covers both kinds of flow.

(3) Traffic identification algorithm based on application fingerprints. A large-scale automatic extraction algorithm for application fingerprints is proposed. One or more feature strings are extracted for each application from the packets it generates, a scoring model for filtering feature strings is proposed, and feature streamlining strategies are developed. The filtered feature strings are combined into an application fingerprint that identifies the application's traffic, and the HyperScan algorithm is used to improve string matching efficiency so that the method meets the traffic identification needs of large-scale application scenarios.

The methods proposed in this thesis greatly improve mobile traffic identification, with application coverage of 70%, flow coverage of up to 87%, and byte coverage of up to 98%.
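The static DNS database idea in (1) can be illustrated with a small sketch. It is not the thesis's implementation: it assumes the domain-to-application correspondence has already been collected (the crawler step is omitted), it uses a trie keyed on reversed domain labels as one plausible reading of "dictionary tree matching", and all names, domains, and IP addresses in it are hypothetical.

```python
from typing import Dict, Optional


class DomainTrie:
    """Trie keyed on domain labels in reverse order (com -> example -> api)."""

    def __init__(self) -> None:
        self.children: Dict[str, "DomainTrie"] = {}
        self.app: Optional[str] = None  # set when a registered domain ends here

    def insert(self, domain: str, app: str) -> None:
        """Register `domain` (and, implicitly, its subdomains) for `app`."""
        node = self
        for label in reversed(domain.lower().strip(".").split(".")):
            node = node.children.setdefault(label, DomainTrie())
        node.app = app

    def lookup(self, domain: str) -> Optional[str]:
        """Return the app of the longest registered suffix of `domain`."""
        node, best = self, None
        for label in reversed(domain.lower().strip(".").split(".")):
            node = node.children.get(label)
            if node is None:
                break
            if node.app is not None:
                best = node.app
        return best


def identify_by_server_ip(
    ip: str, ip_to_domain: Dict[str, str], trie: DomainTrie
) -> Optional[str]:
    """Map server IP -> domain (static DNS database) -> application (trie)."""
    domain = ip_to_domain.get(ip)
    return trie.lookup(domain) if domain is not None else None


# Hypothetical data: in practice the IP-domain pairs would come from DNS
# responses and the domain-app pairs from the crawler step.
trie = DomainTrie()
trie.insert("example.com", "ExampleApp")
trie.insert("cdn.example.org", "ExampleVideo")

ip_to_domain = {
    "203.0.113.10": "api.example.com",
    "198.51.100.7": "img.cdn.example.org",
    "192.0.2.1": "unknown.example.net",
}

for ip in ip_to_domain:
    print(ip, "->", identify_by_server_ip(ip, ip_to_domain, trie))
```

Reversing the labels lets a single registered suffix such as example.com match any of its subdomains, and keeping the deepest match gives longest-suffix priority when several registered domains overlap.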
Keywords/Search Tags: Deep Packet Inspection, Domain Name System, Audio and Video Applications, Feature String, Application Fingerprint