| Cluster analysis assigns data objects to clusters according to their similarity, such that objects in the same cluster are similar to each other while objects in different clusters have low similarity. With the development of cluster analysis, more and more clustering algorithms have been proposed. Compared with other types of clustering algorithms, density-based clustering algorithms can not only identify clusters of arbitrary shape and recognize and handle noisy data, but also automatically discover the number of clusters. Clustering algorithms based on cluster boundary detection use the obtained cluster boundary information in the subsequent clustering process, which lets them deal effectively with different datasets and improves clustering accuracy. However, these two types of algorithms still have some problems: (1) the definition of density is not accurate enough, and the importance of neighbors is not considered when estimating local density; (2) they rely on a global threshold, which may leave them unable to accurately identify the boundary between adjacent clusters of variable density and makes them sensitive to parameters; (3) the clustering process has poor tolerance, and the accuracy of cluster center identification is not high. To address these problems, this article proposes improvements to density-based and cluster boundary detection-based clustering algorithms, providing new ideas for research on such algorithms. A clustering algorithm based on shared nearest neighbors is proposed, which is divided into two stages: local density estimation and density-based clustering. First, a weight coefficient is designed based on the idea of shared nearest neighbors to assign different weights to the K neighbors of a data object. Second, the density estimation function of a data object is redefined, which avoids dependence on a global threshold and improves the accuracy of density estimation. Finally, according to the local
density of the data objects, a new clustering strategy is proposed. By setting different values and allocating data objects only when the conditions are satisfied, the proposed algorithm clusters more reasonably and discovers cluster centers automatically with higher accuracy. A clustering algorithm combining density estimation and cluster boundary detection is also proposed, which is implemented in three main steps: cluster boundary detection, core cluster identification, and clustering of the remaining data objects. First, a new density-based method is proposed to detect cluster boundaries. This method considers not only the relative position of a data object but also the local density of the object and its neighbors, which improves the accuracy of cluster boundary detection. After boundary detection has identified the internal and boundary objects of each cluster, the core objects and cluster centers are distinguished from the internal objects, and the core clusters are identified by a breadth-first search starting from the cluster centers. Finally, the remaining data objects are assigned to the existing core clusters by similarity measurement. The method proposed in this paper handles datasets with different distributions and dimensions well, and also improves the accuracy of clustering results by identifying the boundaries of adjacent variable-density clusters. According to experimental results on different datasets, the clustering algorithm based on shared nearest neighbors improves on the accuracy of traditional density estimation methods, and matches or exceeds the clustering performance of the recently proposed density-based clustering algorithms SNN-DPC and ADPC. In addition, the clustering algorithm combining density estimation and cluster boundary detection can accurately identify variable-density clusters and detect the cluster boundary of
different datasets, which improves the accuracy of clustering results. |
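
The abstract describes the shared-nearest-neighbor weighting only in words, so the following is a minimal sketch of the general idea rather than the authors' method. It assumes a hypothetical weight equal to the fraction of K nearest neighbors that two objects share, and a hypothetical density that sums these weights scaled by `exp(-distance)`; the function names `knn_indices` and `snn_weighted_density` are illustrative and do not come from the source.

```python
import numpy as np


def knn_indices(X, k):
    """Return each point's k nearest neighbors (self excluded) and the distance matrix."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    idx = np.argsort(dist, axis=1)[:, :k]
    return idx, dist


def snn_weighted_density(X, k):
    """Local density with neighbors weighted by shared-nearest-neighbor overlap.

    ASSUMPTION: the weight and the density formula below are illustrative
    choices, not the definitions used in the summarized work.
    """
    idx, dist = knn_indices(X, k)
    neighbor_sets = [set(row.tolist()) for row in idx]
    rho = np.zeros(X.shape[0])
    for i in range(X.shape[0]):
        for j in idx[i]:
            shared = len(neighbor_sets[i] & neighbor_sets[int(j)])
            weight = shared / k  # assumed weight in [0, 1]
            rho[i] += weight * np.exp(-dist[i, j])  # assumed distance kernel
    return rho


# Five tightly packed points plus one distant outlier.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [0.1, 0.1], [0.05, 0.05], [10.0, 10.0]])
rho = snn_weighted_density(X, k=3)
```

Because the weight depends only on neighbor overlap, no global distance threshold is needed; under these assumptions the distant point receives a much lower density than the packed points.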