Studies On Multiscale Integrated Representation Of Network Traffic And Intelligent Analysis Methods

Posted on: 2023-07-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H W Bai
Full Text: PDF
GTID: 1528307061974039
Subject: Control Science and Engineering
Abstract/Summary:
As network protocol types multiply and the scale of network traffic grows rapidly, cyberspace governance tasks that rely on traffic analysis, such as application type identification, undesirable content monitoring, network threat detection, and abnormal behavior analysis, face serious challenges. Traditional protocol parsing and traffic analysis based on protocol specifications and reverse analysis struggle to meet the automation, accuracy, and adaptability requirements of these tasks. Intelligent traffic analysis techniques based on feature engineering and deep learning have been developed in response. However, these techniques are still constrained by the limited representation capability of their features and models. They cannot adapt to the continuous emergence of disguised and deformed protocols, discover the various types of latent attack traffic, or penetrate the communication behavior hidden in complex bearer traffic. There is an urgent need for a new theory of network traffic representation and for new intelligent traffic analysis methods. To this end, this dissertation constructs a multiscale integrated representation model for network traffic. Building on this model, it proposes several intelligent traffic analysis methods and applies them to automated network protocol parsing, malicious traffic detection, and complex bearer service identification and component inference. The main research results of the dissertation are as follows:

(1) To address the lack of a traffic representation theory in the field of intelligent traffic analysis, an integrated representation model based on multiscale traffic analysis is constructed. The model captures the syntactic, semantic, and behavioral information of network protocols at three scales: field, message, and session. Standard protocol field representations, message syntax representations, and session behavior representations are established for these three scales, respectively. The protocol field representation gives a standard description of a protocol field by defining the set of basic elements of the field and formalizing its type, name, position, length, value, and other attributes. The message syntax representation deconstructs the hierarchical, juxtaposition, mapping, and other relationships between the basic elements of the protocol to form a structured syntax representation. The session behavior representation uses a formal language to construct feature sets that describe session interaction behavior. In addition, the dissertation uses a graph model to describe the fusion of the representations at the field, message, and session scales. The multiscale integrated representation embeds network protocol knowledge deeply in the representation and has a consistent form that machine learning frameworks can process easily, which lays the foundation for the subsequent intelligent traffic analysis tasks.

(2) To address the difficulty that existing element parsing methods have in coping with the continuous emergence of disguised and deformed protocols, an intelligent message parsing method based on self-enhanced learning of field representations is proposed. Based on the proposed multiscale integrated representation framework, the method first maps network traffic into a sequence of basic elements. A hierarchical self-reinforcing deep neural network model is then designed, consisting of a field function inference module and a field parameter inference module. The function inference module infers attributes such as field types and names, and the parameter inference module strengthens its parameter inference by receiving part of its input from the function inference module. Experiments on intelligent parsing of multi-type HTTP traffic collected on a live network show that the proposed method can effectively output full element parsing information and parse the traffic into standard field sequence representations.

(3) To address the problem that existing deep learning models have difficulty in accurately detecting the various types of latent attack traffic, the dissertation proposes an intelligent identification method for malware traffic based on message syntax representation. Based on the proposed multiscale integrated representation framework, the method characterizes messages as structured sequences of basic elements that carry semantics. It then extracts the local associative semantic features of these sequences with convolutional kernels of multiple sizes and extracts their temporal features with a bidirectional gated recurrent model. The method exploits the spatial and temporal correlation of message segments to mine their behavioral semantics and to learn features that discriminate malicious traffic from normal traffic. Comparative experiments on publicly available traffic datasets show that the method can effectively detect malicious traffic and outperforms many existing typical deep learning methods.

(4) To address the problem that existing methods struggle to gain fine-grained insight into the communication behavior hidden in complex bearer traffic, a complex bearer service component inference method based on session behavior representation is proposed. Based on the proposed multiscale integrated representation framework, the method first maps traffic into behavioral semantic representations in four dimensions: length, time, direction, and payload. These behavioral representations are then combined with classifiers such as decision trees and random forests to build a machine learning model for bearer service type recognition. On this basis, for the complex situation in which multiple services are carried in the same tunnel, the dissertation screens effective behavioral representation elements by principal component analysis. It then uses a deep neural network regression model to construct a mapping between these elements and the bearer service component parameters, and finally infers the component information of the services carried in the tunnel. Experiments on traffic data collected from a real network show that the proposed method can accurately estimate the bearer service components and can effectively improve fine-grained sensing of complex bearer traffic.
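
As an illustration of the field-scale representation in result (1), the following is a minimal Python sketch of a record that holds the attributes named in the abstract (type, name, position, length, value) and of how an HTTP request line could be mapped into a sequence of such records. The names `FieldRepr` and `parse_request_line`, the `"token"` type label, and the sample message are hypothetical and chosen only for illustration. They are not taken from the dissertation, whose actual formalization and learned parser are not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FieldRepr:
    """Hypothetical field-scale record with the attributes listed in the abstract."""
    ftype: str     # illustrative type label, e.g. "token"
    name: str      # field name
    position: int  # byte offset of the field within the message
    length: int    # field length in bytes
    value: bytes   # raw field value


def parse_request_line(message: bytes) -> List[FieldRepr]:
    """Map the first line of an HTTP request into a sequence of field records.

    This is a hand-written rule for one well-known syntax, not the
    learning-based parser that the dissertation proposes.
    """
    line = message.split(b"\r\n", 1)[0]
    names = ["method", "target", "version"]
    fields = []
    offset = 0
    for name, token in zip(names, line.split(b" ", 2)):
        fields.append(FieldRepr("token", name, offset, len(token), token))
        offset += len(token) + 1  # skip the single space separator
    return fields


fields = parse_request_line(
    b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
)
for f in fields:
    print(f.name, f.position, f.length, f.value)
```

A fixed, uniform record of this kind is one simple way to obtain the "consistent representation form" that the abstract mentions, since every field of every protocol becomes the same tuple of attributes regardless of how it was parsed.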
Keywords/Search Tags:Multiscale modeling, Integrated representation, Intelligent traffic analysis, Network traffic parsing, Malicious traffic detection, Complex bearer service analysis