| In recent years, with the development of information technology and the generation of large volumes of data, data mining has become particularly important. As an important part of data mining, outlier detection aims to find abnormal records that clearly deviate from the rest of a large data set or fail to meet its expected conditions. Outlier detection is now widely used in telecommunication fraud detection, bank card fraud detection, and air quality index prediction. Outlier detection based on the k-means algorithm is a classic approach, but because it is sensitive to the initial cluster centers, the final result easily falls into a local optimum. To solve this problem, this thesis uses the cuckoo search algorithm to improve traditional k-means outlier detection and proposes an outlier detection method based on k-means with an improved cuckoo search. First, to address the low search accuracy and slow convergence of the cuckoo search algorithm, its discovery probability and step length are made adaptive. Simulation results show that the improved cuckoo search algorithm outperforms the original in both convergence speed and fitness value. Second, using the convergence criterion for random algorithms and a Markov chain model, it is proved that the improved cuckoo search algorithm converges to the global optimum with probability 1. Then, to overcome the shortcomings of the traditional k-means algorithm, the improved cuckoo search algorithm is combined with the k-means outlier detection algorithm, yielding the proposed method. Simulation experiments on three UCI data sets show that the proposed algorithm not only has a clear advantage in accuracy but also converges faster on all three data sets. It effectively reduces the sensitivity of k-means outlier detection to the initial cluster centers and shortens the running time. Finally, the proposed algorithm is applied to network intrusion detection to verify its feasibility. The KDDCUP99 data set is used as the detection data; the attack types are analyzed and the data is preprocessed. The experimental results show that the proposed outlier detection algorithm has good detection ability in intrusion detection. |
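
The sensitivity of k-means outlier detection to its initial cluster centers can be illustrated with a small sketch. This is not the algorithm of the thesis: the cuckoo search is not reproduced, and the sketch only compares two hand-picked initializations. The toy data, the sum of squared errors (SSE) as the fitness value, and the "three times the mean distance" outlier rule are all assumptions made for illustration.

```python
import math


def kmeans(points, init_centers, iters=100):
    """Plain Lloyd's k-means started from the given initial centers."""
    centers = [list(c) for c in init_centers]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Recompute each center as the mean of its members;
        # an empty cluster keeps its previous center.
        new_centers = [
            [sum(col) / len(members) for col in zip(*members)] if members else old
            for members, old in zip(clusters, centers)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def sse(points, centers):
    """Sum of squared distances to the nearest center (assumed fitness value)."""
    return sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)


def flag_outliers(points, centers, factor=3.0):
    """Indices of points farther than factor * mean distance from their
    nearest center (an assumed, illustrative outlier rule)."""
    dists = [min(math.dist(p, c) for c in centers) for p in points]
    cutoff = factor * sum(dists) / len(dists)
    return [i for i, d in enumerate(dists) if d > cutoff]


# Toy data: two 3x3 grids of normal points plus one far-away point (index 18).
points = [(x, y) for x in range(3) for y in range(3)]
points += [(10 + x, 10 + y) for x in range(3) for y in range(3)]
points.append((30, 30))

# Two hand-picked initializations: one near the true clusters,
# one that places a center directly on the anomalous point.
good = kmeans(points, [(0, 0), (12, 12)])
bad = kmeans(points, [(0, 0), (30, 30)])

print(sse(points, good), flag_outliers(points, good))
print(sse(points, bad), flag_outliers(points, bad))
```

With the second initialization, the anomalous point captures a cluster of its own, so its distance to the nearest center is zero and it is never flagged; the SSE is also higher than with the first initialization. A global search over initial centers, such as the improved cuckoo search described in the abstract, is meant to avoid this kind of poor local optimum.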