Application And Research Of Fuzzy Clustering Mining Method Based On Genetic Algorithm

Posted on: 2012-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Li
Full Text: PDF
GTID: 2218330368478994
Subject: Computer application technology
Abstract/Summary:
With information technology and database technology developing rapidly, information processing has become an indispensable tool for acquiring useful information. Data mining is a general knowledge discovery technology: it is the process of finding patterns and relationships in large amounts of data using analytical tools. Cluster analysis is an important component of data mining, and data clustering is an emerging field that draws on many disciplines.

The FCM (fuzzy c-means) algorithm, an unsupervised learning method, is a research hotspot in cluster analysis. FCM is one of the important algorithms among data clustering methods; it is simple, converges quickly, and has strong local search ability. However, FCM is sensitive to initialization and tends to converge to a local minimum during iteration. The genetic algorithm is a stochastic global optimization search algorithm. It is a computational model of natural evolution, with implicit parallelism and the capacity to use global information effectively. Combining the FCM algorithm with a genetic algorithm yields a hybrid algorithm that helps solve clustering problems and substantially improves performance, since the hybrid has both good global and good local search capability.

This paper presents a hybrid fuzzy c-means algorithm (IG-FCM) based on an improved genetic algorithm. The algorithm uses the global search ability of the genetic algorithm to optimize the initial cluster centers, and then carries out the FCM algorithm for local optimization. IG-FCM is a heuristic clustering algorithm: it steps through candidate numbers of clusters in order, then automatically determines the optimal number of clusters and the optimal clustering based on a clustering validity function.
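As an illustration of the local-optimization step described above, the following is a minimal sketch of the standard FCM iteration in plain Python, not the thesis's implementation. The function names `fcm` and `objective`, the fuzzifier default `m=2.0`, the stopping tolerance, and the sample points are all assumptions made for this example. The initial centers are passed in as an argument, which is where a genetic algorithm could supply its best candidate, and `objective` is the usual FCM cost that such a search might use as the basis of its fitness.

```python
import math

def fcm(points, centers, m=2.0, max_iter=100, tol=1e-6):
    """Standard fuzzy c-means started from the given initial centers.

    Returns (centers, u), where u[k][i] is the membership of point k in
    cluster i as computed in the last iteration that ran.
    """
    c, dim, n = len(centers), len(points[0]), len(points)
    centers = [list(v) for v in centers]
    u = []
    for _ in range(max_iter):
        # Membership update: closer centers receive larger memberships,
        # and each row of u sums to 1.
        u = []
        for x in points:
            d = [max(math.dist(x, v), 1e-12) for v in centers]
            u.append([
                1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                for i in range(c)
            ])
        # Center update: mean of the points weighted by membership ** m.
        new_centers = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            new_centers.append(
                [sum(w[k] * points[k][a] for k in range(n)) / total
                 for a in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop once the centers have effectively settled
            break
    return centers, u

def objective(points, centers, u, m=2.0):
    """FCM cost: membership-weighted sum of squared distances (lower is better)."""
    return sum(
        u[k][i] ** m * math.dist(points[k], centers[i]) ** 2
        for k in range(len(points))
        for i in range(len(centers))
    )

# Two well-separated groups of sample points (made up for this example).
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, u = fcm(points, centers=[(0.0, 1.0), (4.0, 4.0)])
cost = objective(points, centers, u)
```

Because the result depends on the starting centers, running `fcm` from several candidate initializations and comparing `objective` values is one simple way to see the sensitivity to initialization that motivates the hybrid approach.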
Because the traditional genetic algorithm suffers from slow convergence, poor stability, and low accuracy, this paper adopts an optimum preservation (elitist) strategy in the selection operation to retain the best individual during evolution. The selected individual is copied, and the best individual enters the next generation directly without taking part in crossover or mutation. The copies and the other individuals take part in crossover and mutation on the basis of maximizing the degree of genetic variation. IG-FCM improves considerably on classical clustering algorithms in performance: it ensures the stability of genetic evolution and improves convergence speed and accuracy.

Based on an analysis of the characteristics of the IG-FCM algorithm and of clustering algorithms applied to intrusion detection systems, and in view of the insufficient detection performance of existing intrusion detection systems, this paper proposes a hybrid weighted FCM clustering algorithm (IG-WFCM). The algorithm can be used to partition the training data set of an intrusion detection system into clusters, and the system can then classify network data based on the clustering results. In the preprocessing of intrusion detection data, IG-WFCM distinguishes between continuous and discrete attributes, and it uses a weighted hybrid distance metric to measure the similarity of data. To detect abnormal data and improve the detection rate of the intrusion detection system, the algorithm sets a width threshold for the normal data class.

In this paper, an intrusion detection simulation experiment is carried out on the KDD CUP 1999 data set using the IG-WFCM algorithm. The results show that the average detection rate reached 80.1% and the average false positive rate remained at about 1.605%.
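The optimum preservation strategy described above can be sketched as a single generation step. This is a minimal illustration under stated assumptions, not the thesis's operator set: the abstract does not specify how parents are chosen or which crossover and mutation operators are used, so uniform random parent choice, the blend crossover, the Gaussian mutation, and the toy fitness function below are all hypothetical stand-ins. In IG-FCM the fitness would presumably be derived from the clustering objective instead.

```python
import random

def blend_crossover(a, b, rng):
    """Hypothetical crossover: a random convex combination of two parents."""
    t = rng.random()
    return [t * x + (1.0 - t) * y for x, y in zip(a, b)]

def gaussian_mutation(ind, rng, rate=0.2, sigma=0.1):
    """Hypothetical mutation: perturb each gene with probability `rate`."""
    return [x + rng.gauss(0.0, sigma) if rng.random() < rate else x for x in ind]

def next_generation(population, fitness, rng):
    """One generation with optimum preservation; higher fitness is better."""
    best = max(population, key=fitness)
    # The best individual enters the next generation unchanged.
    new_population = [best]
    # A copy of it joins the other individuals in the mating pool.
    pool = [ind for ind in population if ind is not best] + [list(best)]
    while len(new_population) < len(population):
        a, b = rng.sample(pool, 2)  # assumption: parents drawn uniformly
        new_population.append(gaussian_mutation(blend_crossover(a, b, rng), rng))
    return new_population

def toy_fitness(ind):
    """Toy fitness with its maximum at (3, 3); stands in for a clustering score."""
    return -sum((x - 3.0) ** 2 for x in ind)

rng = random.Random(0)
population = [[rng.uniform(0.0, 10.0) for _ in range(2)] for _ in range(8)]
history = []  # best fitness after each generation
for _ in range(30):
    population = next_generation(population, toy_fitness, rng)
    history.append(max(toy_fitness(ind) for ind in population))
```

Since the best individual is always carried over untouched, the values in `history` can never decrease from one generation to the next, which is the stability property the strategy is meant to provide.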
These experimental results demonstrate the feasibility and effectiveness of the IG-WFCM algorithm: it overcomes shortcomings of the FCM algorithm such as being easily trapped in local minima and low detection accuracy, and it improves the performance and efficiency of the intrusion detection system to a certain extent.
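For readers unfamiliar with the two figures reported above, the sketch below computes a detection rate and a false positive rate under their usual definitions: the fraction of attack records flagged as attacks, and the fraction of normal records wrongly flagged as attacks. The abstract does not state the exact definitions used in the experiment, so these are assumed, and the function name `detection_metrics` and the sample labels are made up for this example.

```python
def detection_metrics(true_labels, predicted_labels):
    """Return (detection_rate, false_positive_rate) for 'attack'/'normal' labels."""
    pairs = list(zip(true_labels, predicted_labels))
    attacks = sum(1 for t, _ in pairs if t == "attack")
    normals = sum(1 for t, _ in pairs if t == "normal")
    detected = sum(1 for t, p in pairs if t == "attack" and p == "attack")
    false_alarms = sum(1 for t, p in pairs if t == "normal" and p == "attack")
    # Guard against empty classes so the function never divides by zero.
    detection_rate = detected / attacks if attacks else 0.0
    false_positive_rate = false_alarms / normals if normals else 0.0
    return detection_rate, false_positive_rate

# Made-up labels: 5 attacks (4 caught) and 5 normal records (1 false alarm).
true_labels = ["attack"] * 5 + ["normal"] * 5
predicted_labels = ["attack"] * 4 + ["normal"] + ["attack"] + ["normal"] * 4
rates = detection_metrics(true_labels, predicted_labels)
print(rates)  # prints (0.8, 0.2)
```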
Keywords/Search Tags: Data Mining, Intrusion Detection, Genetic algorithm, Fuzzy Clustering algorithm