
An IDTW Similarity Measurement Algorithm And Time Series Hierarchical Clustering Application

Posted on: 2023-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: J N Pan
GTID: 2568306620493934
Subject: Control Science and Engineering
Abstract/Summary:
In recent years, with the emergence of big data and cloud computing, time series data has become the most common form of data. Time series clustering is a mainstream data mining technology, and different clustering results can be obtained depending on whether the object is a time series dataset or a single long time series. The key problem in time series clustering is how to better calculate the similarity between time series. The existing literature has proposed many improvements to similarity measurement and has applied a large number of clustering methods to time series data, but it has not identified why similarity measurement performs poorly or why similarity cannot be used for reasonable clustering, which has led to unsatisfactory clustering results. Therefore, this thesis proposes an Improved Dynamic Time Warping (IDTW) algorithm and a hierarchical agglomerative clustering method based on IDTW (IDTW-HAC), and applies this method to multivariate time series data from photovoltaic microgrids to discriminate working conditions. A series of experiments demonstrates the effectiveness of the IDTW similarity measurement algorithm and the superiority of the IDTW-HAC method, and shows that applying IDTW-HAC to the multivariate time series data of photovoltaic microgrids determines the operating conditions more accurately, thereby supporting fault diagnosis. The main work and innovations of this thesis are as follows:

1. To address the problem of calculating similarity between time series, the precision loss caused by the "multi-point matching" phenomenon of the Dynamic Time Warping (DTW) algorithm is analyzed, and an improved DTW algorithm (IDTW) is proposed. The specific process of the algorithm is given. Nearest neighbor classification experiments are carried out on 14 UCR time series datasets. The experimental results show that IDTW achieves higher classification accuracy than four other similarity measurement algorithms (ED, DTW, CDTW and ACDTW), and that the precision loss caused by "multi-point matching" is mitigated to a certain extent.

2. To address the problem that many clustering methods cannot be combined with similarity matrices, a hierarchical agglomerative clustering method oriented to time series similarity is proposed, and the detailed process of the IDTW-HAC method is given. In the experiments, 6 UCR time series datasets and five methods (K-means, HAC, DTW-HAC, CDTW-HAC and IDTW-HAC) are selected for whole time series clustering. The final results show that, under the optimal parameters, the IDTW-HAC method achieves the best clustering results and obtains the maximum value on all external clustering evaluation measures.

3. To address the problems in fault diagnosis of photovoltaic time series data, a subsequence hierarchical agglomerative clustering (SHAC) method based on IDTW similarity is proposed, and the detailed process of obtaining subsequences, clustering the subsequences, and making decisions for the sample points within subsequences is given. The multivariate time series data of photovoltaic microgrid equipment is used to discriminate the working conditions. The experiment consists of two parts: the first analyzes the multivariate time series data of the photovoltaic microgrid directly as static multidimensional data using various clustering methods; the second compares subsequence hierarchical agglomerative clustering algorithms under different similarity measures. The final results show that the IDTW-SHAC method achieves the maximum value on a series of evaluation indicators, and that the clusters it forms are closer to the real working conditions, so the operating conditions can be better distinguished for reasonable fault diagnosis.
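As a rough illustration of the building blocks named above, the following sketch shows classic DTW with its warping path, followed by agglomerative clustering on the resulting precomputed distance matrix. It is not the thesis's IDTW or IDTW-HAC, whose details the abstract does not give. The function names `dtw` and `hac`, the absolute-difference point cost, the average-linkage rule, and the toy series are all assumptions made for this example.

```python
from math import inf


def dtw(a, b):
    """Classic DTW: return (distance, warping path) for two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed point-wise cost
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end to recover which points were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


def hac(dist, k):
    """Average-linkage agglomerative clustering on a precomputed distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[p] for j in clusters[q])
                d = total / (len(clusters[p]) * len(clusters[q]))
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        merged = clusters.pop(q)  # q > p, so index p stays valid
        clusters[p].extend(merged)
    return clusters


# Toy data (made up): two low-level and two high-level series, each pair time-shifted.
series = [
    [0, 0, 1, 2, 1, 0, 0],
    [0, 1, 2, 1, 0, 0, 0],
    [5, 5, 6, 7, 6, 5, 5],
    [5, 6, 7, 6, 5, 5, 5],
]
dist = [[dtw(s, t)[0] for t in series] for s in series]
clusters = hac(dist, 2)
distance01, path01 = dtw(series[0], series[1])
```

In this toy case the warping path between the first two series is longer than either series, meaning at least one point is matched to several points of the other series. That is the "multi-point matching" behaviour the abstract identifies as a source of precision loss in DTW. Because the clustering step only needs the distance matrix, any other similarity measure could be swapped in for `dtw` without changing `hac`.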
Keywords/Search Tags: data mining, time series clustering, similarity measurement, photovoltaic microgrid, fault diagnosis