Research On Clustering Algorithm Based On Grid Point Density Estimation

Posted on: 2020-11-24

Degree: Master

Type: Thesis

Country: China

Candidate: L Wang

Full Text: PDF

GTID: 2428330596987331

Subject: Software engineering

Abstract/Summary:

Research on machine learning algorithms is a significant branch of artificial intelligence and draws on many disciplines. The goal of machine learning is to simulate human learning behavior so that a system can acquire new knowledge, update its knowledge structure, and improve its own performance. In recent years, machine learning research has made great progress and a wide variety of algorithms have been proposed. Machine learning algorithms are usually divided into three categories: supervised, unsupervised, and semi-supervised learning algorithms. Clustering is one of the most representative unsupervised learning tasks. Based on certain characteristics, a clustering algorithm assigns similar data points in a data set to the same cluster and dissimilar data points to different clusters. Although a variety of clustering algorithms have been proposed, most traditional clustering methods can only be applied to spherical data, and their results may be affected by parameter settings and initialization. In addition, when the number of data points and the number of dimensions become very large, the efficiency of a clustering algorithm is limited by its time complexity and space complexity.

Therefore, a fast and robust grid-based clustering method is proposed in this thesis. It can identify clusters of arbitrary shape and can also cope with large data sets. In the improved method, the number of grid divisions is first determined automatically by a given formula. Then the algorithm calculates the densities of the grid nodes instead of the traditional grid cell densities. Finally, the classical breadth-first search algorithm is used to perform clustering based on the densities of the grid nodes. Experiments on multiple artificial and real data sets show that this method is more efficient and effective than traditional clustering methods.

In addition, the values of clustering evaluation indexes usually need to be calculated to evaluate the clustering results. The traditional point-to-point comparison method is inefficient for computing the evaluation indexes of big data sets. This thesis therefore gives a method for calculating the clustering evaluation indexes from the confusion matrix. The experimental results show that this method markedly improves the efficiency of obtaining the evaluation index values.
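The three steps named in the abstract (choose the grid, compute grid-node densities, group nodes by breadth-first search) might look roughly like the following minimal 2-D sketch. The abstract gives neither the formula for the number of grid divisions nor the density estimator, so everything concrete here is an assumption for illustration: the function name `grid_node_clustering`, the caller-supplied `divisions` and `min_density` parameters, counting each point at its nearest grid node, and linking dense nodes through 8-neighbour adjacency.

```python
from collections import deque

import numpy as np


def grid_node_clustering(points, divisions, min_density):
    """Cluster 2-D points using grid-node densities and breadth-first search.

    Assumptions (not taken from the thesis): every point is counted at its
    nearest grid node, a node is "dense" when its count >= min_density, and
    dense nodes are connected through 8-neighbour adjacency.
    """
    points = np.asarray(points, dtype=float)
    lo, hi = points.min(axis=0), points.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero

    # Index of the nearest grid node for each point, in [0, divisions] per axis.
    idx = np.rint((points - lo) / span * divisions).astype(int)

    # Node density = number of points whose nearest node is that node.
    density = np.zeros((divisions + 1, divisions + 1), dtype=int)
    np.add.at(density, (idx[:, 0], idx[:, 1]), 1)

    node_label = np.full(density.shape, -1, dtype=int)
    n_clusters = 0
    for start in zip(*np.nonzero(density >= min_density)):
        if node_label[start] != -1:
            continue  # already reached from an earlier dense node
        node_label[start] = n_clusters
        queue = deque([start])
        while queue:  # breadth-first search over adjacent dense nodes
            i, j = queue.popleft()
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if (0 <= ni <= divisions and 0 <= nj <= divisions
                            and node_label[ni, nj] == -1
                            and density[ni, nj] >= min_density):
                        node_label[ni, nj] = n_clusters
                        queue.append((ni, nj))
        n_clusters += 1

    # Points attached to sparse nodes keep label -1 (treated as noise here).
    return node_label[idx[:, 0], idx[:, 1]], n_clusters
```

Because the search runs over grid nodes rather than over data points, the cost of the grouping step in this sketch depends on the grid size and not on the number of points, which is the usual reason grid-based methods scale to large data sets.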
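The abstract does not say which evaluation indexes are computed from the confusion matrix. As one hedged illustration, pair-counting indexes such as the Rand index can be obtained from the confusion (contingency) matrix of true versus predicted labels without comparing every pair of points. The function names `pair_counts_from_confusion` and `rand_index` below are hypothetical, and the choice of the Rand index is an assumption.

```python
import numpy as np


def pair_counts_from_confusion(true_labels, pred_labels):
    """Return (tp, fp, fn, tn) pair counts computed from the confusion matrix.

    tp: pairs grouped together in both labelings
    fp: pairs together only in the predicted labeling
    fn: pairs together only in the true labeling
    tn: pairs separated in both labelings
    """
    true_labels = np.asarray(true_labels).ravel()
    pred_labels = np.asarray(pred_labels).ravel()

    # Map arbitrary labels to 0..k-1 and build the confusion matrix.
    _, t = np.unique(true_labels, return_inverse=True)
    _, p = np.unique(pred_labels, return_inverse=True)
    confusion = np.zeros((t.max() + 1, p.max() + 1), dtype=np.int64)
    np.add.at(confusion, (t, p), 1)

    def comb2(x):
        # Number of unordered pairs that can be drawn from x items.
        return x * (x - 1) // 2

    same_both = int(comb2(confusion).sum())
    same_true = int(comb2(confusion.sum(axis=1)).sum())
    same_pred = int(comb2(confusion.sum(axis=0)).sum())
    total = int(comb2(true_labels.size))

    tp = same_both
    fn = same_true - tp
    fp = same_pred - tp
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def rand_index(true_labels, pred_labels):
    """Rand index: fraction of point pairs on which the two labelings agree."""
    tp, fp, fn, tn = pair_counts_from_confusion(true_labels, pred_labels)
    return (tp + tn) / (tp + fp + fn + tn)
```

A point-to-point comparison inspects all n(n-1)/2 pairs, whereas this sketch makes one pass over the n labels and then works on a matrix whose size depends only on the number of clusters. It assumes at least two data points, since the Rand index is undefined otherwise.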

Keywords/Search Tags:

grid-based clustering, grid nodes, breadth-first search, clusters with arbitrary shapes, clustering result evaluation

Related items

1	The Research Of Clustering Algorithms For Mining Arbitrary Shaped Clusters
2	Research On Split-and-merge Based Clustering Algorithm
3	The Research On Arbitrary Shape Cluster Algorithm Based On Hierarchy And Density
4	Research And Implementation Of Clustering Feedback Grid Resource Distribution Search Engine
5	Study On Grid-based Clustering Algorithms
6	Grid Clustering Algorithm
7	Research On Data Stream Clustering Algorithm Based On Similarity And Grid Partition Optimization
8	Research On K-medoids Clustering Algorithm Based On Breadth-first Search
9	Research On An Effective Self Adapted Grid-Density Based Clustering
10	Study On Grid-Based Clustering Algorithms