| Partition-based clustering methods mainly comprise K-Means and K-Medoids; the other methods are variants of these two. However, the results of K-Means fluctuate depending on the initial K center samples: because the centers are selected randomly, the iterative process may end at a local extremum that is not the optimal value. Traditional K-Medoids also achieves good local search results, but it needs a longer running time. In this paper, to address the fluctuation of the results, we present an approach to selecting the initial cluster centers. First, we provide a method to explore and analyze the distribution of the data. Then we build a clustering tree according to that distribution. When clustering, we use binary search to fix a proper threshold according to the user-supplied parameter K, so as to create K+X partitions. We then merge the X smallest areas into the other K clusters and optimize the K clusters with the gradient center method until the clusters stop moving. Finally, we compute the center of each of the K clusters and select the K points nearest to those centers as the initial cluster centers. Experiments show that the method avoids fluctuation of the results and achieves higher accuracy and recall than the traditional method. To obtain better results in a shorter time, we also present an improvement. When replacing an original center object with a better non-center object, we find the object that maximizes the objective function and then reassign the remaining objects. In this way, the objective value increases faster. Experiments show that the improvement converges faster and achieves a higher F-measure. Finally, we applied the improved algorithm to the Dalian Police Office Network Campaign System to analyze and cluster criminals. |
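
The final seeding step described above (compute the center of each preliminary cluster, then take the nearest actual data point as the initial center) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `centroid`, `nearest_point`, and `initial_medoids` are hypothetical, Euclidean distance is an assumption, and the preliminary clusters are taken as given rather than derived from the clustering tree and threshold search.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))

def nearest_point(points, target):
    """Member of `points` with the smallest Euclidean distance to `target`."""
    return min(points, key=lambda p: math.dist(p, target))

def initial_medoids(clusters):
    """Pick, for each preliminary cluster, the data point closest to its centroid.

    Assumption: `clusters` is the list of K preliminary clusters produced by
    the earlier partitioning stage, which is not reproduced here.
    """
    return [nearest_point(c, centroid(c)) for c in clusters]

# Toy data: two preliminary clusters in the plane (illustrative values only).
clusters = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.4, 0.6)],
    [(10.0, 10.0), (11.0, 10.0), (10.0, 12.0)],
]
seeds = initial_medoids(clusters)
print(seeds)
```

Because the seeds are computed deterministically from the data rather than drawn at random, repeated runs on the same preliminary clusters start from the same centers, which is the property the abstract relies on to avoid fluctuating results.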