| Objective: Diabetes is a systemic disease caused by an imbalance in the body, whose main manifestation is a blood sugar level above the normal range. Because of its complications and high prevalence, it has gradually become an important public health issue worldwide. According to the International Diabetes Federation (IDF), by the end of 2013 diabetes-related diseases had caused more than 5.1 million deaths worldwide, 8.39% of annual deaths, and the related medical costs reached $548 billion that year, accounting for 11% of total medical costs. Since the start of the 21st century, the diabetic population in China has been rising. By the end of 2013, China had the largest number of cases of any country in the world, with 98.4 million diabetic patients. Hence, using statistical analysis of diabetes data to effectively prevent the occurrence and development of the disease has great practical significance. This paper has two main purposes: first, to apply the grid-based CLIQUE clustering algorithm to spatio-temporal data of patients with diabetes and compare it with a partition-based algorithm (K-means) and a density-based algorithm (DBSCAN); second, to further analyze the clustering results along each dimension (such as age, sex, and living habits) in order to help prevent diabetes. Methods: Spatial clustering analysis is an important field of data mining research. It can be used as a standalone tool to discover information hidden in a database, or as a pre-processing step for other data mining algorithms. Its main purpose is to divide a dataset into several clusters, minimizing the differences within clusters and maximizing the differences between clusters.
Similarity is described mainly by the distance between objects: the greater the distance, the smaller the similarity. Commonly used distances include the Euclidean distance, the Manhattan distance, and the Minkowski distance. Grid clustering partitions the data space into a finite number of grid cells to build a grid structure, and then performs the clustering operations on the grid. Compared with traditional clustering algorithms, grid-based clustering is more efficient and can identify clusters of arbitrary shape. Grid clustering analysis has been widely used in pattern recognition, data analysis, image processing, and other fields. Results: The clustering time and the clustering accuracy, evaluated with intrinsic and extrinsic methods, were obtained for each algorithm; analysis of these results leads to the following conclusions. Conclusions: Using three clustering algorithms, the paper analyzed 10 years of diabetes patient data from 130 hospitals and compared the algorithms in terms of running time and accuracy of the results. The comparison shows that, in both time and accuracy, CLIQUE is the best clustering algorithm, followed by DBSCAN, with K-means third. |
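
The grid clustering idea described in the Methods can be illustrated with a minimal sketch. This is not the paper's implementation and not the full CLIQUE algorithm, which additionally searches for dense units across subspaces; it only shows the basic steps of partitioning the space into cells, keeping the dense cells, and merging adjacent dense cells into clusters. The function name `grid_cluster` and the parameters `cell_size` and `min_points` are hypothetical, chosen for illustration.

```python
from collections import deque
from itertools import product


def grid_cluster(points, cell_size, min_points):
    """Simplified grid-density clustering (illustrative, not full CLIQUE).

    Returns one label per point: a cluster id >= 0, or -1 for noise.
    """
    # Step 1: assign every point to the grid cell that contains it.
    cells = {}
    for idx, point in enumerate(points):
        key = tuple(int(x // cell_size) for x in point)
        cells.setdefault(key, []).append(idx)

    # Step 2: keep only cells holding at least `min_points` points.
    dense = {key for key, members in cells.items() if len(members) >= min_points}

    # Step 3: merge neighbouring dense cells with a breadth-first search.
    # Two cells are neighbours if every coordinate differs by at most 1.
    dim = len(points[0]) if points else 0
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]
    labels = [-1] * len(points)
    cell_label = {}
    next_label = 0
    for start in sorted(dense):
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            for idx in cells[cell]:
                labels[idx] = next_label
            for off in offsets:
                neighbour = tuple(c + o for c, o in zip(cell, off))
                if neighbour in dense and neighbour not in cell_label:
                    cell_label[neighbour] = next_label
                    queue.append(neighbour)
        next_label += 1
    return labels


# Toy two-dimensional data, made up for illustration only.
sample = [(0.1, 0.2), (0.3, 0.4), (0.8, 0.9), (1.1, 0.5),
          (1.4, 0.6), (5.0, 5.1), (5.2, 5.3), (9.0, 0.0)]
print(grid_cluster(sample, cell_size=1.0, min_points=2))
```

Because the work is done per cell rather than per pair of points, the cost of the merge step depends on the number of occupied cells, which is one reason grid methods can be faster than distance-based ones on large datasets; the result does, however, depend on the chosen cell size and density threshold.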