
Density Peak Clustering Algorithm Based On Local Density Optimization And Its Implementation In Power GPS Patrol System

Posted on: 2021-04-18  Degree: Master  Type: Thesis
Country: China  Candidate: J Wang  Full Text: PDF
GTID: 2392330605454173  Subject: Software engineering
Abstract/Summary:
As an important branch of unsupervised learning, clustering algorithms can discover the internal structure of a data set and the latent relationships between its samples without any prior knowledge. The algorithm Clustering by Fast Search and Find of Density Peaks (DPC) was published in Science in 2014 and has been widely adopted by researchers. DPC provides a simple and efficient way to locate cluster centers quickly, which avoids the large amount of computing resources that most clustering algorithms spend on repeated iterations to determine cluster centers. However, because DPC's assignment strategy is simple, it falls short on various complex data sets. When clusters lie close together, DPC suffers from error propagation: a misassigned sample point on the common boundary between clusters passes its error on to the points assigned after it. DPC also has difficulty identifying boundary points and noise points, that is, samples far from the cluster center. For complex manifold data sets, the uneven distribution of sample points within a manifold cluster produces multiple density peaks inside a single cluster, and DPC has no means of merging several high-density sub-clusters into one cluster. To address these problems, this thesis makes the following contributions:

(1) When multiple clusters lie close together, the local density of the sample distribution differs from cluster to cluster, so DPC has difficulty assigning the sample points where clusters meet, and it has difficulty distinguishing boundary points from noise points. To solve these problems, this thesis constructs a Gaussian kernel discriminant based on the local density information of the sample distribution within each cluster. Combining this Gaussian kernel with the fast clustering of DPC, it proposes a density peak clustering algorithm based on local-density Gaussian kernel optimization, named Gauss-DPC. In experiments on 10 different data sets, Gauss-DPC is compared with three classic clustering algorithms: DPC, DBSCAN, and K-means. The results show that Gauss-DPC is more precise in assigning the sample points that lie between clusters and in constraining cluster boundaries.

(2) This thesis proposes a density peak clustering algorithm based on K-nearest-neighbor local similarity optimization, named KNS-DPC. When DPC processes manifold clusters, a cluster that contains multiple density peaks is split into multiple sub-clusters. To solve this problem, KNS-DPC uses the similarity between sub-clusters to merge the split sub-clusters, which improves the clustering result on manifold clusters. The thesis introduces the concept of a local core cluster, defines the similarity between local core clusters as the basis for merging them, and defines the transition density between clusters to judge whether sub-clusters in the data set need to be merged. KNS-DPC is compared with DPC, DBSCAN, and K-means in experiments on 5 artificial manifold data sets and 5 real data sets. The experiments show that KNS-DPC has clear advantages in handling manifold clusters.

(3) The two improved algorithms are applied to the GPS track optimization module of a power company's patrol system. For the large number of irregular spheroidal clusters of GPS track points, Gauss-DPC is used to extract each cluster's center point as the cluster representative, which thins the track and reduces the rendering load on the front end. For GPS track points with an irregular manifold distribution, KNS-DPC is used to identify the manifold track, and the local core points in the cluster are used to represent it, thereby optimizing the track.

In summary, this thesis studies and analyzes the density peak clustering algorithm. Starting from the characteristics of the data sets to be processed, and building on DPC's ability to quickly obtain the density peaks of the entire data set, it proposes corresponding optimized algorithms for closely packed clusters and for manifold clusters. The improved algorithms were then applied to the patrol system of a power company to thin and optimize GPS track data.
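For orientation, the baseline DPC procedure that both proposed algorithms build on can be sketched as follows. This is a minimal illustration of the original 2014 algorithm only, not of Gauss-DPC or KNS-DPC, whose definitions are not given in this abstract. The function name `dpc`, the cutoff distance `d_c`, the Gaussian-kernel form of the density estimate, and the choice to pass the number of clusters explicitly (instead of reading centers off a decision graph) are all assumptions made for the example.

```python
import numpy as np

def dpc(points, d_c, n_clusters):
    """Baseline density peak clustering (illustrative sketch, not the thesis's code)."""
    n = len(points)

    # Pairwise Euclidean distances between all samples.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density rho_i (assumed Gaussian-kernel variant);
    # subtract 1 to drop each point's own contribution exp(0).
    rho = np.exp(-(dist / d_c) ** 2).sum(axis=1) - 1.0

    # delta_i: distance to the nearest point of higher density.
    order = np.argsort(-rho)          # densest first
    delta = np.zeros(n)
    parent = np.full(n, -1)           # nearest denser neighbor
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = dist[i].max()  # convention for the global peak
            continue
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        parent[i] = j

    # Cluster centers: points with the largest gamma = rho * delta.
    gamma = rho * delta
    gamma[order[0]] = np.inf          # the global peak is always a center
    centers = np.argsort(-gamma)[:n_clusters]

    # Every other point inherits the label of its nearest denser neighbor.
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers, rho, delta
```

The single-pass label inheritance in the last loop is the simple assignment strategy the abstract refers to: one wrong link near a shared boundary is copied to every point assigned after it. In a track-thinning setting such as the one described in (3), `points[centers]` would be the reduced set of representative points, under the assumption that cluster centers are used directly as representatives.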
Keywords/Search Tags:Density Peak Clustering, Gaussian Kernel, Neighbor Similarity, Neighbor Cluster Merging