| With the improvement of network infrastructure and the development of Internet of Things technology, the network has become an indispensable part of daily life and has connected things more closely. These intricate connections form a huge and complex network. In the era of big data, the explosive growth of network size and data volume has given rise to network big data, which is affecting and changing people's living habits and work patterns. People attach great importance to network security. Using network big data to analyze the network situation has become a hotspot in the field of network security, because analysis across multiple dimensions makes it possible to grasp network security trends from a macro perspective. Network big data is the data that can be acquired on the Internet through the interaction of multiple worlds in cyberspace. These data are widely available, multi-source and heterogeneous, interactive, bursty, and noisy. They not only contain rich unstructured data and complex associated knowledge, but are also highly time-sensitive and exist as streaming data that is generated dynamically and quickly. In many network security situation awareness technologies, the main method is to analyze the data records in the network in order to identify what is present in the network data and its possible impact. However, against the background of big data, existing network security situation awareness models have several disadvantages, such as high resource cost, low precision and accuracy of analysis results, low processing efficiency, and high requirements for network equipment configuration. Therefore, the existing models cannot be extended to real-time, large-scale applications. To overcome the shortcomings of existing network security situation awareness technologies, this paper combines the characteristics of network big data to construct a data analysis model based on distributed technology. The model is mainly divided into
three parts: network security situation detection, network security situation understanding, and network security situation projection. On this basis, four network security situation awareness models based on big data are proposed. The first is a network security situation awareness model based on neural networks. In this model, data simplification and cleaning are performed according to the characteristics of the data records. To address multi-source heterogeneity and high noise, a three-layer back propagation neural network is used as the core of the model, and an error feedback strategy is used to improve precision and accuracy. The second is a network security situation awareness model based on random forest. The model reduces the dimensionality of the data through data feature analysis to highlight the characteristics of the data records and reduce invalid data. This reduces the model's resource overhead and its dependence on network equipment configuration. Using the random forest algorithm as the core of the model makes it possible to effectively distinguish various abnormal behaviors in the network. The third is a network security situation awareness model based on a star structure. In this model, the problems of correlation between data records and independence of data records are addressed by optimizing the association rule mining algorithm. The naive Bayes algorithm is the core of the model. The trend of the entire network environment is analyzed efficiently by fusing local prediction results. The fourth is an adaptive network security situation awareness model. The model uses data features to dynamically generate a network situational anomaly library, which effectively solves the problem of analyzing and processing network data streams that are generated quickly and dynamically. With the dynamic time warping algorithm as its core, the model combines the characteristics of offline learning and online learning to analyze and process the
streaming network big data. The model can effectively deal with the breadth and burstiness of real-time data flows. These four models are solutions to the big data problem. They have complex structures and are integrated on a distributed platform. These big data processing models based on distributed technologies can effectively address resource consumption, analysis accuracy, real-time performance, and other problems that traditional models cannot solve. For big data applications, traditional memory-based machine learning algorithms are no longer applicable. Parallelism is the mainstream method for dealing with big data. The basic idea of the network security situation awareness model proposed in this paper is to realize the different functions of the model by using different parallelized machine learning algorithms. First, the network big data is cleaned and preprocessed; then the network situation is understood and analyzed; finally, the perceived results of the network security situation are obtained from the analysis results. In this paper, the four models are applied to large-scale data sets on a distributed platform. The results show that the proposed network security situation awareness models perform well. |
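
As an illustration of the fourth (adaptive) model's core idea, the following is a minimal sketch of how a window of streaming measurements might be matched against an anomaly library using dynamic time warping. It is not the implementation from this paper: the function names `dtw_distance` and `classify_window`, the library labels, the sample values, and the threshold are all hypothetical, and the per-point cost is assumed to be the absolute difference.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences.

    Assumption: the per-point cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify_window(window, library, threshold):
    """Return (label, distance) for the nearest template in the library.

    If no template lies within `threshold`, the window is labeled "normal".
    `library` maps a hypothetical anomaly label to a template sequence.
    """
    best_label, best_dist = "normal", inf
    for label, template in library.items():
        dist = dtw_distance(window, template)
        if dist < best_dist:
            best_label, best_dist = label, dist
    if best_dist <= threshold:
        return best_label, best_dist
    return "normal", best_dist


# Hypothetical anomaly library; in the paper's model it would be generated
# dynamically from data features rather than written by hand.
library = {
    "burst": [0, 5, 9, 9, 9],
    "ramp": [1, 2, 3, 4, 5],
}

label, dist = classify_window([0, 5, 5, 9, 9], library, threshold=2.0)
print(label, dist)
```

Because DTW allows one sequence to be stretched against the other, a window whose burst arrives slightly later than in the template can still match it closely, which is one reason the technique suits bursty streams. In an offline/online split, the library could be built offline and updated as new windows arrive; that update step is omitted here.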