A Design Of Clustering Mining Algorithm Distinguishing The Multi-dimensional Based On Grid Density

Posted on:2015-02-01

Degree:Master

Type:Thesis

Country:China

Candidate:Q Wang

Full Text:PDF

GTID:2297330467482638

Subject:Statistics

Abstract/Summary:

Cluster analysis is an important part of data mining, and the clustering algorithm is its core because it determines the quality of the results. How to further improve clustering efficiency and reduce the cost and burden on users, while still guaranteeing the stability and effectiveness of the algorithm, has become a meaningful research topic. Traditional clustering algorithms place high demands on computer hardware resources and take a long time to cluster massive data sets, so this paper proposes a new clustering algorithm based on grid and density. Grid-based clustering is fast and efficient, but its clustering quality is limited. Density-based clustering can find clusters of arbitrary shape, but it is costly in time because of its complexity in high-dimensional data spaces. Because the two approaches are complementary, combining them, using grid density to distinguish (partition) the sample space, can greatly improve the efficiency of clustering.

The method proceeds as follows. First, create the grid, that is, mesh the initial data space. Next, partition the sample space using a threshold derived from the grid density, so that the grid cells are divided into high-density and low-density regions. The high-density cells are sorted by density to find the densest cell. The nearest low-density cells surrounding the densest cell are used to delimit the highest-density cluster. The points of this highest-density region are removed and the remaining high-density cells are sorted again. These steps are repeated until the whole space has been partitioned. Finally, the center of gravity of each sub-cluster is calculated. Clusters whose centers of gravity are close together are merged to form a new cluster center, and merging continues until the number of clusters equals the given number of classes. This forms the final clustering result.

The paper first argues from theory that the design of the algorithm is reasonable and sound. It then uses Matlab to generate several groups of random data for an empirical analysis, in which the algorithm is compared with the classic K-means algorithm in terms of squared deviation and running time. The results show a significant improvement in computation time and verify the efficiency of the algorithm.
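
The steps described in the abstract can be illustrated with a short sketch. It is not the thesis's implementation: the abstract does not give the threshold rule, the cell size, or the exact way the surrounding low-density cells bound a cluster. The code below therefore makes the following assumptions: the threshold is the mean number of points per non-empty cell, a cluster grows from the densest cell through adjacent high-density cells and stops at low-density cells, and points in low-density cells are left unassigned. All function names are hypothetical, and Python is used here although the thesis reports experiments in Matlab.

```python
from collections import defaultdict, deque
from itertools import product
from math import dist, floor


def build_grid(points, cell_size):
    """Map each point to the integer index of the grid cell containing it."""
    grid = defaultdict(list)
    for p in points:
        cell = tuple(floor(x / cell_size) for x in p)
        grid[cell].append(p)
    return grid


def neighbours(cell):
    """Yield every cell adjacent to `cell` (sharing a face, edge or corner)."""
    for offset in product((-1, 0, 1), repeat=len(cell)):
        if any(offset):
            yield tuple(c + o for c, o in zip(cell, offset))


def dense_regions(grid, threshold):
    """Group high-density cells into regions, always starting from the densest."""
    dense = {c for c, pts in grid.items() if len(pts) >= threshold}
    regions = []
    while dense:
        # Re-select the densest remaining cell after each region is removed.
        seed = max(dense, key=lambda c: len(grid[c]))
        dense.remove(seed)
        queue, region = deque([seed]), []
        while queue:
            cell = queue.popleft()
            region.extend(grid[cell])
            for n in neighbours(cell):
                if n in dense:  # low-density cells stop the expansion
                    dense.remove(n)
                    queue.append(n)
        regions.append(region)
    return regions


def centroid(points):
    """Center of gravity (coordinate-wise mean) of a list of points."""
    return tuple(sum(xs) / len(points) for xs in zip(*points))


def merge_to_k(regions, k):
    """Merge the two regions with the closest centroids until k remain."""
    regions = [list(r) for r in regions]
    while len(regions) > k:
        pairs = [(i, j) for i in range(len(regions))
                 for j in range(i + 1, len(regions))]
        i, j = min(pairs, key=lambda ij: dist(centroid(regions[ij[0]]),
                                              centroid(regions[ij[1]])))
        regions[i].extend(regions.pop(j))
    return regions


def grid_density_cluster(points, cell_size, k):
    """Return at most k clusters (lists of points) from the dense grid cells."""
    grid = build_grid(points, cell_size)
    # Assumed threshold: the mean number of points per non-empty cell.
    threshold = len(points) / len(grid)
    return merge_to_k(dense_regions(grid, threshold), k)
```

Calling `grid_density_cluster(points, cell_size, k)` with `points` as a list of equal-length coordinate tuples returns a list of clusters. Because each point is visited once when the grid is built and all later work is done on cells rather than on individual points, the sketch reflects the efficiency argument made in the abstract; the pairwise centroid merge is written for clarity, not speed.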

Keywords/Search Tags:

Clustering Algorithm, Grid, Density, Distinguish

Related items

1	A Study On Hierarchical Clustering Of Micro-learning Units Based On Topic Feature Centers
2	Research On Spatiotemporal Behavior Of Campus Network Users Based On Clustering Algorithm
3	The Research On Clustering Of Mixed Data Stream Based On DPC Algorithm
4	Density Bias Sampling Algorithm Based On Big Data And Its Application Research
5	Study On Team Defense Strategy In RoboCup 2D Based On Density Clustering Algorithm
6	Research On Improved Hybrid Recommendation Algorithm Based On Clustering
7	Improved Chameleon Clustering Algorithm Based On K-medoids
8	Exploration Of Density-Based Clustering On High-Dimensional Data And Its Applications
9	Research On Key Technologies Of Campus Network Behavior Analysis Robot Based On Clustering Algorithm
10	Research On Mixed Data Clustering Algorithm Based On Information Entropy To Define Attribute Weights