Research On Anomaly Detection Based On Kernel Function Stream Clustering

Posted on: 2023-06-10 | Degree: Master | Type: Thesis
Country: China | Candidate: R H Huang | Full Text: PDF
GTID: 2568307151479404 | Subject: Computer software and theory
Abstract/Summary:
With the rapid development of communication technology, the Industrial Internet of Things (IIoT) is widely used in energy, manufacturing, medicine, and other fields, producing large-scale, high-speed, high-dimensional data streams. IIoT security incidents are frequent, and the associated problems are increasingly prominent. Data stream anomaly detection supports the development of IIoT security technology and is currently an active research topic. However, anomaly detection on high-dimensional data streams faces many difficulties, so a fast and feasible detection method for such streams is an important research goal.

Stream clustering is one of the most effective approaches to high-dimensional data stream anomaly detection. Because it updates incrementally and produces accurate clustering results, it has become a research focus in this area. Despite the many studies that apply stream clustering to data stream anomaly detection, three problems remain. First, the sketch data structures used by many stream clustering algorithms cannot avoid interference from anomalies in high-dimensional data streams. Second, many algorithms cannot process nonlinear high-dimensional data streams quickly. Third, stable and efficient solutions for hidden anomalies in data streams are lacking. Detecting anomalies in high-dimensional data streams efficiently and quickly therefore remains a major challenge.

To address these problems, this thesis analyzes the advantages and disadvantages of existing data stream anomaly detection algorithms. Building on the Euler kernel function, it proposes a projected micro-cluster structure based on shared neighbor density. Building on the Gaussian kernel density estimator, it performs fast anomaly detection in combination with a stacked habituation autoencoder. A hidden anomaly detection and analysis system is designed and implemented for a distributed computing environment. Finally, comparative experiments demonstrate the advantages of the proposed methods.

The main contributions of this thesis are as follows:

(1) To address anomaly interference in high-dimensional data streams, this thesis combines the Euler kernel function with shared neighbor distance to produce a shared neighbor density, which is used to construct a projected micro-cluster structure. This structure effectively distinguishes micro-clusters of different categories, reduces sensitivity to anomalous points, and improves the purity of stream clustering.

(2) To improve the efficiency of anomaly detection on nonlinear high-dimensional data streams, this thesis uses a stacked habituation autoencoder to reduce the dimensionality of the data and extract the nonlinear relationships in the data stream. Using the distribution of the data stream, it combines micro-clusters with a kernel density estimator to build an anomaly detection factor for fast anomaly detection.

(3) Because hidden anomalies are sparse and unstable, this thesis takes the log stream as its starting point. Following a distributed, high-availability design, it vectorizes the log stream with different log rules and applies an adaptive anomaly detection algorithm to detect hidden anomalies, achieving high detection accuracy and system stability.
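As a rough illustration of how a Gaussian kernel density estimator can serve as an anomaly score, the sketch below rates a point by its estimated density relative to a set of reference points and flags it when the density falls below a threshold. This is a generic textbook construction, not the thesis's anomaly detection factor; the function names, the `bandwidth` parameter, and the `threshold` value are all assumptions made for the example.

```python
import math

def gaussian_kde_score(x, reference, bandwidth=1.0):
    """Estimate the density at x with an isotropic Gaussian kernel.

    `reference` stands in for recent stream points or micro-cluster
    centers (a hypothetical choice for this sketch).
    """
    d = len(x)
    # Normalizing constant of a d-dimensional isotropic Gaussian.
    norm = (2 * math.pi * bandwidth ** 2) ** (d / 2)
    total = 0.0
    for r in reference:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, r))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / (len(reference) * norm)

def is_anomaly(x, reference, threshold, bandwidth=1.0):
    """Flag x as anomalous when its estimated density is below threshold."""
    return gaussian_kde_score(x, reference, bandwidth) < threshold

# Illustrative data: four points near the origin and two query points.
reference = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0)]
near, far = (0.0, 0.0), (5.0, 5.0)
print(is_anomaly(near, reference, threshold=0.01))
print(is_anomaly(far, reference, threshold=0.01))
```

In a streaming setting the reference set would have to be updated incrementally, and both the bandwidth and the threshold would need tuning against the data; the fixed values here are only placeholders.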
Keywords/Search Tags: Anomaly detection, Stream clustering, Kernel function, Kernel density estimation, Autoencoder, Habituation