Study On Parameter-free Peak Clustering Algorithm

Posted on: 2020-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: L B Jin
GTID: 2428330599976461
Subject: Computer Science and Technology

Abstract/Summary:
Data mining is the process of using advanced, accurate techniques to extract new knowledge from large, complex and redundant data. Its purpose is to find potential associations within the data and to give researchers sound scientific guidance. Clustering is an important class of unsupervised algorithms in data mining: it aims to find the inherent distribution structure of the data for further analysis, and it is widely applied in many research fields, including pattern recognition, information retrieval, neural networks and image processing. This thesis studies clustering in depth and proposes three new parameter-free peak clustering algorithms:

1. Laplacian centrality peaks clustering based on potential entropy (PELC). Most current clustering algorithms are sensitive to parameters, cannot complete clustering automatically, and cannot remove noise points. PELC uses the concept of potential entropy to extract the parameters the algorithm requires automatically from the original data, and combines this with the clustering principle of the DBSCAN framework to complete the clustering automatically.
2. Curvature-based Laplacian centrality peaks clustering (LCPC). Traditional clustering algorithms cannot effectively determine the number of clusters. LCPC determines the number of clusters k by analyzing the curvature of an evaluation graph, which plots the within-cluster variance against the number of clusters during the clustering process.
3. An improved k-means algorithm based on node centrality and curvature (LCK). k-means selects its initial cluster centers randomly and cannot effectively determine the number of clusters. LCK uses Laplacian centrality to evaluate the importance of nodes, obtains the number of clusters k with the curvature-based method, and then chooses k initial cluster centers according to node importance to complete the clustering.

The thesis describes PELC, LCPC and LCK in detail, including the specific implementation steps of each algorithm, the experimental data sets and evaluation indices, and an analysis of the clustering results. The data sets used in the experiments include comprehensive data sets, real data sets and high-dimensional data sets, among others. The parameter-free peak clustering algorithms alleviate, to a certain extent, the parameter sensitivity of traditional clustering algorithms, achieve truly parameter-free clustering, and show certain advantages in clustering quality.
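The curvature-based choice of k described for LCPC and LCK can be illustrated with a small sketch. The abstract does not give the thesis's exact curvature definition, so the code below assumes the standard plane-curve formula κ = |y''| / (1 + y'²)^(3/2), evaluated by finite differences on an evaluation curve whose two axes have been rescaled to [0, 1]. The function name `pick_k_by_curvature` and the sample variances are hypothetical and serve only as illustration.

```python
import numpy as np


def pick_k_by_curvature(ks, variances):
    """Return the k at which the (k, variance) curve bends most sharply.

    Assumption (not from the thesis): curvature is the plane-curve
    formula |y''| / (1 + y'^2)^1.5 on axes rescaled to [0, 1].
    The values in `ks` are assumed to be evenly spaced and increasing.
    """
    ks = np.asarray(ks, dtype=float)
    v = np.asarray(variances, dtype=float)
    if ks.size != v.size or ks.size < 3:
        raise ValueError("need at least three (k, variance) pairs")
    if v.max() == v.min():
        raise ValueError("variance curve is flat; no elbow to detect")

    # Rescale both axes so curvature does not depend on the units used.
    x = (ks - ks.min()) / (ks.max() - ks.min())
    y = (v - v.min()) / (v.max() - v.min())
    h = x[1] - x[0]

    # Central differences, defined on the interior points only.
    dy = (y[2:] - y[:-2]) / (2.0 * h)
    d2y = (y[2:] - 2.0 * y[1:-1] + y[:-2]) / h**2
    kappa = np.abs(d2y) / (1.0 + dy**2) ** 1.5

    best = int(np.argmax(kappa)) + 1  # shift back to an index into ks
    return int(ks[best]), kappa


# Hypothetical within-cluster variances for k = 1..8 with an elbow at k = 3.
sample_ks = list(range(1, 9))
sample_variances = [100.0, 40.0, 10.0, 9.0, 8.0, 7.5, 7.0, 6.8]
best_k, curvature = pick_k_by_curvature(sample_ks, sample_variances)
print(best_k)  # prints 3
```

In practice the variance values would come from running the clustering once for each candidate k. Rescaling the axes is a design choice made here so that the detected elbow does not shift when the data are measured in different units.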
Keywords/Search Tags: Laplacian centrality, density clustering, peak clustering, potential entropy, number of clusters, curvature