
Density Peak Clustering Algorithms And Its Application In Dam Monitoring Data

Posted on: 2021-04-26
Degree: Master
Type: Thesis
Country: China
Candidate: F Zhu
Full Text: PDF
GTID: 2392330623467281
Subject: (degree of mechanical engineering)
Abstract/Summary:
With the continuous improvement of dam monitoring systems, the number of monitoring points keeps growing, each point accumulates a large volume of data, and the accumulated data are strongly correlated. Effective data mining methods are needed to mine the correlations in these complex monitoring data quickly and accurately, screen out typical monitoring points, and accurately grasp the dam's safety status. Clustering by fast search and find of density peaks (CFSFDP) is a recently proposed data mining method. It places few requirements on the distribution of the data set, is not sensitive to noise, and can quickly and accurately cluster data sets of arbitrary shape. The method has significant advantages in processing complex data, but it has deficiencies in practical data analysis. To address them, this thesis introduces the extension correlation function into the density peak clustering algorithm to overcome the large number of connection errors that arise during clustering, and then uses an evolutionary algorithm to further optimize the improved algorithm and overcome its stability problem at run time. Finally, the improved algorithm is applied to dam monitoring data, and typical monitoring points for dam deformation, cracks, and stress and strain are accurately screened out. The main work is as follows:

(1) The CFSFDP algorithm is prone to errors when determining cluster centers and assigning non-center points. This thesis proposes a density peak clustering algorithm optimized with the extension correlation function (Extension Correlation Function-CFSFDP, referred to as EC-CFSFDP). It improves the algorithm in two main respects, cluster center selection and the allocation strategy for non-center points, to reduce the effect of sample connection errors. For cluster center selection, the concept of average difference degree is introduced as the measure of sample density, which avoids the interference of multiple sample points with similar density. On this basis, a normalized decision function is proposed to improve the distribution of the variables and ensure that cluster centers are selected accurately. For the allocation strategy, the extension correlation function replaces the traditional distance-based similarity measure. Based on the k-neighborhood of each sample point, the sub-domains and classic domains of each cluster are constructed, and the value of the extension correlation function, which represents the similarity of the sample points, is computed. The non-center points are then assigned according to this value, which reduces the connection error effect. Comparative experiments show that the time complexity of the improved algorithm does not increase, while the clustering accuracy of EC-CFSFDP is significantly higher than that of CFSFDP. However, the choice of the k-neighborhood value in the allocation strategy lacks a principled basis, which affects the stability of the algorithm.

(2) This thesis proposes an automatic clustering algorithm based on the k-neighbor correlation function (AUTO EC-CFSFDP, referred to as AEC-CFSFDP). A genetic algorithm is used to optimize the k-neighbor value and so overcome the instability that the choice of k causes in EC-CFSFDP. By introducing intra-cluster and inter-cluster similarity indices, a clustering-effect balance criterion function is defined as the objective function for evaluating each k value during the iterations. Adaptive crossover and mutation probabilities are introduced into the selection, crossover and mutation steps to reduce the influence of these probabilities on population diversity, speed up convergence, and obtain the globally optimal k value. This k value is then passed to the EC-CFSFDP algorithm, which completes the clustering automatically. The experimental results show that the method resolves the stability problem of EC-CFSFDP without increasing the time complexity. Comparative experiments show that the accuracy of AEC-CFSFDP is higher than that of the EC-CFSFDP, IDPCA, DBSCAN and k-means algorithms. The algorithm places few requirements on the sample distribution and clusters data sets with different distribution forms efficiently.

(3) The improved algorithm is applied to the selection of typical monitoring points for the dam of a hydropower station. Missing values in the monitoring data are filled by mean interpolation, and the data are then normalized. The pre-processed data are clustered separately for the dam displacement, crack, and stress and strain monitoring points, and typical monitoring points are then selected. These are compared with the typical monitoring points selected by the DBSCAN, OPTICS and k-means algorithms; the smaller the average total error, the more accurate the selection. The experimental results show that, for the dam displacement, crack, and stress and strain monitoring data, the average total error of the typical monitoring points selected by AEC-CFSFDP is significantly lower than that of the other three algorithms, indicating that the method is well suited to dam monitoring data and achieves high accuracy.

By combining the concept of the extension correlation function with an evolutionary algorithm, this thesis remedies the shortcomings of the traditional density peak clustering algorithm. The experimental analysis shows that the improved algorithm overcomes the connection errors and improves stability. Real-time monitoring and analysis of dam deformation, crack, and stress and strain data is key to ensuring the safe operation of dams. Applying the improved density peak clustering algorithm to the mining of dam monitoring data with complex correlations yields typical monitoring points that provide targeted guidance for dam operation and management staff.
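For orientation, the baseline CFSFDP procedure that the abstract builds on can be sketched as below. This is a minimal illustration of the standard density peak idea only, not the thesis's EC-CFSFDP or AEC-CFSFDP. The function name `cfsfdp`, the Gaussian density kernel, the cutoff distance `d_c`, and the fixed `n_clusters` argument are all assumptions made for the example. The average difference degree, the normalized decision function, and the extension correlation function are not reproduced, because the abstract does not define them.

```python
import math

def cfsfdp(points, d_c, n_clusters):
    """Baseline density peak clustering (illustrative sketch only).

    points: list of equal-length coordinate tuples
    d_c: cutoff distance for the density kernel (assumed parameter)
    n_clusters: number of centers to pick (assumed to be given)
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i, here with a Gaussian kernel (an assumption;
    # a hard cutoff count is the other common choice).
    rho = [sum(math.exp(-(dist[i][j] / d_c) ** 2) for j in range(n) if j != i)
           for i in range(n)]

    # Indices from densest to sparsest; ties keep index order.
    order = sorted(range(n), key=lambda i: -rho[i])

    # delta_i: distance to the nearest point that ranks as denser.
    # parent[i] remembers that point for the assignment step.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            # Convention for the densest point: its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers are the points with the largest gamma_i = rho_i * delta_i.
    # Sorting `order` (stable) keeps the densest point first on ties.
    centers = sorted(order, key=lambda i: -(rho[i] * delta[i]))[:n_clusters]

    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c

    # Each remaining point inherits the label of its denser neighbor.
    # Going from dense to sparse guarantees the parent is labeled first.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

The final loop is where the connection errors mentioned in the abstract come from: one point attached to the wrong denser neighbor passes its wrong label on to every point that depends on it. Calling `cfsfdp(points, d_c=1.0, n_clusters=2)` on two well-separated groups of points should return one label per group.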
Keywords/Search Tags:clustering algorithm, density peak, extension correlation function, k-neighbor, dam monitoring