The density peak clustering algorithm is a popular density-based clustering algorithm that is simple, efficient, and novel. It uses a decision function to select points with large decision values as cluster centers, and a one-step allocation strategy to assign the remaining points. Although the algorithm is efficient, its clustering results are sensitive to the input cutoff (truncation) distance parameter, cluster center selection requires human intervention, and the problem of propagating errors in data object allocation has not been effectively addressed. This paper introduces the idea of k-nearest neighbors and proposes corresponding improvement strategies. The specific research contents and results are as follows.

A new density peak clustering algorithm based on shared k-nearest neighbor optimization is proposed. It addresses two problems of the density peak clustering algorithm: the cutoff distance parameter is chosen from empirical values, and too many or too few cluster centers may be selected for a data set. First, the k-nearest neighbor algorithm is used to find the shared nearest neighbors of each data object, and a similarity measure is used to compute its local density. Then, an inflection point discrimination method is proposed to select cluster centers adaptively. Finally, the remaining points are allocated according to similarity, so that samples with high similarity are placed in the same cluster.

A new density peak clustering algorithm based on k-nearest neighbors and an electrostatic force model is proposed. It addresses the sensitivity to the cutoff distance parameter and the allocation errors that may occur for the remaining data objects of complex data sets. First, because the k-nearest neighbors of a data object reflect its local state well, they are used to improve the original local density measure. Then, outliers in the data set are removed by defining the average local density and the average relative distance. Finally, drawing on electrostatic force, the idea of local gravitation is refined and rules for allocating the remaining points are proposed, which allocate boundary points effectively.

Theoretical analysis and experimental verification on synthetic and real data sets show that the proposed algorithms effectively improve clustering quality.
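For orientation, the baseline density peak clustering procedure that both proposed algorithms modify can be sketched as follows. This is a minimal illustration of the standard scheme, not the improved algorithms of this paper. The function name `dpc`, the parameter names `d_c` and `n_centers`, and the choice of the decision value as the product of local density and relative distance are assumptions made for the example.

```python
import numpy as np

def dpc(points, d_c, n_centers):
    """Baseline density peak clustering (illustrative sketch only)."""
    n = len(points)
    # Pairwise Euclidean distances between all data objects.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Local density: number of other points closer than the cutoff distance d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit points from densest to sparsest; the stable sort breaks ties by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)       # relative distance to the nearest denser point
    parent = np.full(n, -1)   # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; use the largest distance.
            delta[i] = dist.max()
            continue
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        parent[i] = j
    # Decision value: large for points that are both dense and far from denser points.
    gamma = rho * delta
    # The number of centers is supplied by hand here (the "human intervention" step).
    centers = np.argsort(-gamma, kind="stable")[:n_centers]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    # One-step allocation: each remaining point inherits its parent's label.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, gamma
```

Both inputs in this sketch correspond to weaknesses named above: the result depends on the hand-picked `d_c`, and `n_centers` must be read off by the user. The one-step allocation also shows how errors propagate, since a point that inherits a wrong label passes it on to every point that has it as parent.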