
The Research And Application On The Population Clustering Algorithm For Data Mining

Posted on: 2007-02-24  Degree: Master  Type: Thesis
Country: China  Candidate: X Cai  Full Text: PDF
GTID: 2178360242961607  Subject: Control Science and Engineering
Abstract/Summary:
As population is a country's most important resource, strengthening population management with information technology matters greatly to other undertakings in China, the country with the largest population in the world. In particular, the "shack up in" population is closely tied to public security management. Clustering analysis is an important research problem in data mining. The goal of clustering is to partition a data set, without any prior knowledge, into clusters such that data within a cluster are similar and data in different clusters are dissimilar, which makes it very different from data classification. Clustering the "shack up in" population and discovering its distinct groups helps to revise and establish public security management.

Many clustering algorithms have been proposed, including distance-based, density-based and grid-based algorithms. This thesis focuses on distance-based clustering, represented by the k-medoids algorithm, and density-grid-based clustering, represented by the CLIQUE algorithm. In k-medoids, the number of loop iterations grows quadratically with the number of data points, so the thesis introduces a difference matrix of the medoids to speed up clustering. In CLIQUE, the density threshold takes a single value no matter how the dimensionality grows, and ξ (the number of partitions along each dimension) likewise takes a single value. The optimizations proposed here maximize data overlap: every dimension has its own ξ, and the density threshold decreases as the dimensionality increases.

To test the performance of the clustering algorithms, we designed and implemented an experimental clustering program that handles data connection, clustering and two-dimensional data visualization. Experimental results show that the improved k-medoids algorithm is fast but prone to converging to a local optimum, and that the optimized CLIQUE algorithm achieves a larger scale. Finally, a practical problem is solved using the idea behind the improved k-medoids algorithm.
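
The abstract does not spell out how the difference matrix is built, so the following is only a minimal sketch under one assumption: that pairwise dissimilarities are computed once and then looked up during the medoid swap search, instead of being recomputed in every iteration. The function names (`distance_matrix`, `total_cost`, `k_medoids`), the Euclidean distance, and the greedy swap loop are illustrative choices, not the thesis's actual method.

```python
import math
from itertools import combinations


def distance_matrix(points):
    """Precompute all pairwise Euclidean distances once (assumed 'difference matrix')."""
    n = len(points)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = math.dist(points[i], points[j])
    return d


def total_cost(d, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in d)


def k_medoids(points, initial_medoids):
    """Greedy swap search that only reads from the precomputed matrix."""
    d = distance_matrix(points)
    medoids = list(initial_medoids)
    best = total_cost(d, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(len(medoids)):
            for cand in range(len(points)):
                if cand in medoids:
                    continue
                # Try replacing the medoid at `pos` with the candidate point.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(d, trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    # Label each point with the position of its nearest medoid.
    labels = [
        min(range(len(medoids)), key=lambda c: d[i][medoids[c]])
        for i in range(len(points))
    ]
    return medoids, labels, best


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(k_medoids(pts, initial_medoids=[0, 1]))
```

Because the search accepts only swaps that lower the total cost, it stops at the first configuration with no improving swap, which is one way the local-optimum behaviour reported in the experiments can arise.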
Keywords/Search Tags: data mining, clustering analysis, "shack up in" population, k-medoids algorithm, difference matrix, CLIQUE algorithm