Research On Abnormal Network Traffic Detection Based On Unsupervised Learning

Posted on: 2019-07-28
Degree: Master
Type: Thesis
Country: China
Candidate: J W Wei
GTID: 2428330566494468
Subject: Computer technology
Abstract/Summary:
The rapid development of the Internet has brought more and more convenience to daily life, but it has also brought more and more information security problems into view. In the face of an increasingly severe Internet security situation, our country has issued policies to promote the development of the information security industry. Nevertheless, abnormal network traffic detection, which is the first line of defense in information security, still bears the brunt of the threat. With the rapid development of machine learning in recent years, many researchers have applied machine learning algorithms to anomaly detection and have reported many experimental results. However, network flow data in a real network environment is generally raw and massive, and it is very difficult to label accurately. These properties are very challenging for traditional machine learning algorithms. Clustering is a representative form of unsupervised learning: it can mine hidden patterns directly from unlabeled data and use them to build a detection model. Against this background, this thesis uses unsupervised learning algorithms to study abnormal network traffic detection.

Feature selection is an important step in data preprocessing. This thesis proposes an unsupervised feature selection algorithm. It estimates how much category-relevant information each feature carries by computing the maximal information coefficient (MIC) between features, and then clusters the features according to their degree of similarity. The experimental results show that the selected feature subset has clear advantages over the original feature set, with little impact on accuracy, and that it reduces the complexity of the multi-dimensional features.

Because category labels are difficult to obtain, clustering algorithms are better suited to building anomaly detection models. The experiments found that the density peak clustering algorithm, a density-based method, performs better than other classical clustering algorithms in information security scenarios. However, the original density peak clustering algorithm is inconvenient to tune and cannot be applied to large-scale data. This thesis proposes targeted improvements to address these shortcomings. First, a more concise parameter set is proposed to simplify parameter tuning, and an outlier detection algorithm based on the Gaussian probability density is used to select the algorithm parameters automatically, which reduces the difficulty of tuning to some extent. Second, a sampling method for large-scale data is proposed, and its validity is verified on a real data set.
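
As a rough illustration of the clustering approach the abstract describes, the sketch below implements the standard density peak clustering procedure (local density rho, distance delta to the nearest denser point) and picks cluster centers as Gaussian outliers of the product rho * delta. This is a minimal sketch under stated assumptions, not the thesis's improved algorithm: the function name `density_peaks`, the Gaussian-kernel density, the scale `d_c`, and the threshold rule "mean + k standard deviations" are all illustrative choices that do not come from the abstract.

```python
import math
import statistics


def density_peaks(points, d_c, k=3.0):
    """Density peak clustering sketch (hypothetical helper, not the thesis code).

    points: list of coordinate tuples; d_c: kernel scale (assumed parameter);
    k: number of standard deviations used to flag centers (assumed rule).
    Returns (labels, centers), where centers are indices into points.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho: Gaussian-kernel sum over all other points.
    rho = [sum(math.exp(-(dist[i][j] / d_c) ** 2) for j in range(n) if j != i)
           for i in range(n)]

    # delta: distance to the nearest point ranked denser; parent stores that point.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # convention for the global density peak
            continue
        j = min(order[:rank], key=lambda h: dist[i][h])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers: points whose gamma = rho * delta is an outlier under a Gaussian
    # fit to all gamma values. The global density peak is always kept.
    gamma = [r * d for r, d in zip(rho, delta)]
    threshold = statistics.fmean(gamma) + k * statistics.pstdev(gamma)
    centers = [order[0]] + [i for i in order[1:] if gamma[i] > threshold]

    # Assign every remaining point the label of its nearest denser neighbor,
    # visiting points from densest to sparsest so parents are labeled first.
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers


# Toy usage: two well-separated 5x5 grids of points.
grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
toy_points = grid + [(x + 100.0, y) for x, y in grid]
toy_labels, toy_centers = density_peaks(toy_points, d_c=2.0)
```

On this toy input the two grid midpoints stand out with a large rho * delta, so they are selected as centers without a manually chosen cutoff. The pairwise distance matrix makes the sketch quadratic in the number of points, which is the kind of cost that motivates a sampling step for large-scale data.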
Keywords/Search Tags:information security, abnormal flow detection, unsupervised feature selection, density peak clustering