
Time Series Clustering Based On Gaussian Mixture Model

Posted on: 2022-12-12
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Yang
Full Text: PDF
GTID: 2480306743978079
Subject: Computer software and theory
Abstract/Summary:
Time series are ordered data collected by measurement at uniform time intervals with a given sampling rate, and they are widely used in many fields. In practice, most collected time series are unlabeled samples. Labeled data is difficult to collect, and manual labeling is very costly. Time series clustering is an effective way to analyze large amounts of time series data without any prior knowledge. Its purpose is to divide a given data set into a set of non-overlapping clusters, thereby revealing the underlying structure of the data. However, because time series are high-dimensional, highly redundant and non-linear in structure, applying a traditional clustering algorithm directly to this type of data often fails to give satisfactory results. Appropriate dimensionality reduction and similarity measures have a significant impact on clustering quality. This thesis improves clustering methods for univariate and multivariate time series from these two aspects in order to further improve clustering performance. The main research work is as follows:

(1) Most existing nonlinear dimensionality reduction methods reduce the dimension by preserving global features and ignore the local linear features of the data set. A time series clustering algorithm based on Local Linear Embedding (LLE) and the Gaussian Mixture Model (GMM) is proposed. First, to preserve local features, LLE represents each high-dimensional time series sample as a linear combination of its k nearest neighbors and reconstructs it in a low-dimensional space, which reduces the dimension while preserving the local geometric structure of the data. Then, GMM performs cluster analysis from the perspective of probability distributions. Experimental results on 36 univariate time series datasets show that the new algorithm achieves better clustering performance on univariate time series.

(2) A multivariate time series (MTS) has two dimensions, time and variable, and the traditional Principal Component Analysis (PCA) method is limited in how it represents such data. To address this, an MTS clustering algorithm based on two-dimensional singular value decomposition (2DSVD) and the Gaussian Mixture Model is proposed. First, the eigenvectors of the row-row and column-column covariance matrices of the MTS are computed, and a feature matrix is extracted along both the time and variable dimensions. Then, GMM clusters the feature matrices from the perspective of probability distributions. Experimental results on 13 multivariate time series datasets show that the new algorithm achieves better results.
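As a rough illustration of the two pipelines described above, the following Python sketch uses NumPy together with scikit-learn's `LocallyLinearEmbedding` and `GaussianMixture`. It is not the thesis implementation. The function names, the default parameter values, the standardization of the LLE embedding, the requirement that all multivariate series share the same length, and the flattening of each feature matrix into a vector before fitting the GMM are all assumptions made for illustration.

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.mixture import GaussianMixture


def lle_gmm_cluster(X, n_clusters, n_neighbors=10, n_components=2, seed=0):
    """Cluster univariate series (rows of X) with LLE followed by a GMM.

    X has shape (n_samples, series_length). Hypothetical helper.
    """
    # LLE expresses each series as a weighted combination of its nearest
    # neighbours and finds low-dimensional points with the same weights.
    lle = LocallyLinearEmbedding(
        n_neighbors=n_neighbors, n_components=n_components, random_state=seed
    )
    Z = lle.fit_transform(X)
    # Assumption: rescale the embedding so the GMM's default covariance
    # regularization is small relative to the spread of the data.
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    gmm = GaussianMixture(
        n_components=n_clusters, covariance_type="full", random_state=seed
    )
    return gmm.fit_predict(Z)


def two_dsvd_features(A, k, s):
    """Project each multivariate series onto k row and s column eigenvectors.

    A has shape (n_samples, n_time, n_vars); returns (n_samples, k, s).
    Hypothetical helper.
    """
    D = A - A.mean(axis=0)                 # centre on the mean series
    F = np.einsum("itv,iuv->tu", D, D)     # row-row (time x time) matrix
    G = np.einsum("itv,itw->vw", D, D)     # column-column (var x var) matrix
    # eigh returns eigenvalues in ascending order, so reverse the columns
    # to put the leading eigenvectors first.
    U = np.linalg.eigh(F)[1][:, ::-1][:, :k]
    V = np.linalg.eigh(G)[1][:, ::-1][:, :s]
    # Feature matrix for each sample: U^T D_i V.
    return np.einsum("tk,itv,vs->iks", U, D, V)


def two_dsvd_gmm_cluster(A, n_clusters, k=3, s=2, seed=0):
    """Cluster multivariate series with 2DSVD features followed by a GMM."""
    M = two_dsvd_features(A, k, s)
    # Assumption: flatten each k x s feature matrix into a vector.
    gmm = GaussianMixture(
        n_components=n_clusters, covariance_type="diag", random_state=seed
    )
    return gmm.fit_predict(M.reshape(len(A), -1))
```

Calling `lle_gmm_cluster(X, n_clusters=2)` on an array of equal-length univariate series, or `two_dsvd_gmm_cluster(A, n_clusters=2)` on a three-dimensional array of multivariate series, returns one integer cluster label per series. The number of neighbors, the embedding dimension and the values of `k` and `s` would need tuning for each dataset.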
Keywords/Search Tags: Time Series Clustering, Gaussian Mixture Model, Local Linear Embedding, Two-Dimensional Singular Value Decomposition