| In modern society, as automation improves and chemical production processes grow more complex, the safety of chemical processes has become particularly important. If problems in chemical production are not handled promptly and effectively, they cause economic losses from unqualified products in minor cases and may lead to major safety accidents in serious cases, threatening the personal safety of staff. Fault diagnosis of chemical production processes using machine learning, data mining, and related methods has therefore become an active research topic. Cluster analysis is an effective data mining method that is widely used in fault diagnosis research. The density peak clustering algorithm, a widely used clustering algorithm, has several advantages: simple logic, few required parameters, high scalability, and no redundant iterations during clustering. On the large-scale TE chemical process data set, however, the density peak clustering algorithm has some drawbacks. First, cluster centers must be selected manually, so the algorithm adapts poorly. Second, the time complexity of the algorithm is too high to meet the requirements of real-time monitoring and diagnosis. Finally, the local density calculation is too simple and may lose important information, which lowers fault diagnosis accuracy. To address these shortcomings, this thesis proposes an improved density peak clustering algorithm. First, a sparse search algorithm is used to optimize the calculation of the similarity between each data point and its nearest neighbors, which addresses the high time complexity. Then, a new definition of local density is proposed based on the characteristics of the data distribution. The method applies a weighting based on the number of data points within the truncation distance and the average 
distance from neighboring data points to the point, so that the local density of each data point carries spatial information about the data distribution and is therefore more accurate. Reflecting the spatial structure of the data in this way improves the clustering accuracy of the algorithm. Finally, exploiting the clear difference between cluster centers and non-center points, a decision value combining the local density and the nearest-neighbor distance is calculated, and an exponential function is used to widen the gap in decision value between cluster centers and non-center points so that the algorithm can distinguish the centers more easily. In this way, cluster centers are selected automatically and the adaptability of the algorithm is improved. This thesis studies fault diagnosis of the Tennessee-Eastman (TE) chemical process based on the improved density peak clustering algorithm. The TE chemical process includes the three typical transfer processes and one conversion process found in real chemical processes, and the individual processes are interrelated. Noise is added to the TE chemical process data set to simulate the actual production environment. To achieve effective fault diagnosis, we first denoise the fault data with a wavelet function to remove the noise from the TE chemical process data. We then apply the improved density peak clustering algorithm for cluster analysis, identify abnormal states of the production process from the clustering results, and handle them in time to ensure the safety of the chemical production process. Experimental results show that the proposed fault diagnosis method based on the improved density peak clustering algorithm improves both the speed and the accuracy of fault diagnosis, reduces the false negative and false positive rates of faults, and 
improves the adaptability of fault diagnosis. |
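
The clustering steps summarized above (a weighted local density, an exponentially stretched decision value, and automatic center selection) can be sketched in Python. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: the abstract gives no formulas, so the density weighting `count * exp(-mean_distance / dc)`, the `sharpness` factor, and the largest-gap rule for choosing centers are hypothetical choices, as is the function name `density_peak_clustering`. The sparse search speed-up and the wavelet denoising step are omitted, and a full distance matrix is used instead.

```python
import math


def density_peak_clustering(points, dc, sharpness=5.0):
    """Cluster `points` (a list of at least two coordinate tuples).

    Returns (labels, centers), where centers holds indices into `points`.
    All formulas below are illustrative assumptions, not taken from the source.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density (assumed form): neighbour count inside the truncation
    # distance dc, down-weighted when those neighbours are far away on average.
    rho = []
    for i in range(n):
        near = [dist[i][j] for j in range(n) if j != i and dist[i][j] < dc]
        mean_d = sum(near) / len(near) if near else dc
        rho.append(len(near) * math.exp(-mean_d / dc))

    # delta: distance to the nearest point of higher density. The densest
    # point has no such neighbour, so it takes its largest distance instead.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n
    parent = [-1] * n
    delta[order[0]] = max(dist[order[0]])
    for rank in range(1, n):
        i = order[rank]
        parent[i] = min(order[:rank], key=lambda j: dist[i][j])
        delta[i] = dist[i][parent[i]]

    # Decision value rho * delta, stretched by an exponential so that
    # candidate centres stand further apart from ordinary points.
    gamma = [r * d for r, d in zip(rho, delta)]
    top = max(gamma) or 1.0
    score = [math.exp(sharpness * g / top) for g in gamma]

    # Automatic centre selection (assumed rule): cut at the largest score gap.
    ranked = sorted(order, key=lambda i: -score[i])
    gaps = [score[ranked[k]] - score[ranked[k + 1]] for k in range(n - 1)]
    centers = ranked[: gaps.index(max(gaps)) + 1]

    # Every other point inherits the label of its nearest denser neighbour,
    # visiting points from densest to sparsest so parents are labelled first.
    labels = [-1] * n
    for label, c in enumerate(centers):
        labels[c] = label
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

Calling `density_peak_clustering(points, dc=0.12)` on two small, well-separated groups of points should return one center per group without any manual choice. In a fault diagnosis setting, the resulting labels would then be compared against known operating states to flag abnormal clusters; how the thesis performs that mapping is not stated in the abstract.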