
Research On Multivariate Time Series Clustering Based On Sparse Inverse Covariance

Posted on: 2021-04-13
Degree: Master
Type: Thesis
Country: China
Candidate: W Li
GTID: 2370330614960424
Subject: Computer technology
Abstract/Summary:
Time series data mining is an important and mature research topic that has been studied extensively and solves many problems in applications. Multivariate time series (MTS) are widely used in many fields, and how to cluster MTS accurately and efficiently has become a hot research topic. Compared with univariate time series (UTS), research on MTS is more challenging because of the high dimensionality and the complex dependencies between variables. Traditional clustering methods for UTS therefore cannot be applied directly to MTS. At the same time, many applications generate large amounts of unsegmented time series data. Motivated by these observations, this dissertation studies multivariate time series clustering. The main contributions of this dissertation are as follows:

(1) Since most research on time series focuses on UTS, and traditional distance-based methods are difficult to apply to multivariate time series, we propose a novel model-based method: Kullback-Leibler Divergence-based Sparse Inverse Covariance for Multivariate Time Series Clustering (KLD-SICC). We use a multivariate Gaussian model as both the data representation and the cluster prototype, and Kullback-Leibler divergence as the distance measure. Technically, each MTS is first represented by a multivariate Gaussian model, whose most important parameter is the sparse inverse covariance, solved by the Graphical Lasso algorithm. Then, inspired by the traditional K-means clustering method, with the multivariate Gaussian model as the cluster prototype, the Kullback-Leibler divergence between multivariate Gaussian models is adopted as the distance measure. Compared with traditional MTS clustering methods, KLD-SICC performs well in preventing overfitting and reducing time complexity. Experimental results on various datasets demonstrate that KLD-SICC outperforms state-of-the-art algorithms for MTS clustering.

(2) To segment and cluster multivariate time series simultaneously, a novel model-based approach, adaptive state continuity-based sparse inverse covariance clustering (ASC-SICC), is proposed. Here, the log-likelihood distance is used as the distance measure, and the cluster prototype is a multivariate Gaussian model with sparse inverse covariance. Specifically, state continuity is introduced to make the traditional Gaussian mixture model (GMM) applicable to time series clustering. To prevent overfitting, the alternating direction method of multipliers (ADMM) is applied to optimize the inverse covariance parameters of the GMM. Technically, the adaptive state continuity is first estimated from the distance similarity of adjacent time series data. Then, a dynamic programming algorithm that assigns clusters under adaptive state continuity is taken as the E-step, and the ADMM solver for the sparse inverse covariance is taken as the M-step. The E-step and M-step are combined into an Expectation-Maximization (EM) algorithm that carries out the clustering. Finally, we show the effectiveness of the proposed approach by comparing ASC-SICC with several state-of-the-art approaches in experiments on two datasets from real applications.
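The abstract gives no formulas or code, so the following is only a minimal Python sketch of two building blocks it names, not the thesis implementation. The first function computes the standard closed-form Kullback-Leibler divergence between two multivariate Gaussians when the second is given by its inverse covariance (precision) matrix, as a KLD-SICC-style distance would need. The second function is a Viterbi-style dynamic program that assigns each time step to a cluster while charging a penalty for switching clusters, in the spirit of the ASC-SICC E-step. The names `kl_gaussian`, `assign_with_continuity`, `cost`, and `beta` are hypothetical; how the thesis estimates the adaptive continuity weights is not stated in the abstract, so `beta` is simply taken as an input here, and `cost` is assumed to hold negative log-likelihoods.

```python
import numpy as np

def kl_gaussian(mu0, cov0, mu1, theta1):
    """KL( N(mu0, cov0) || N(mu1, inv(theta1)) ).

    theta1 is the precision (inverse covariance) matrix of the second
    Gaussian, e.g. a sparse estimate; it is assumed positive definite.
    """
    k = mu0.shape[0]
    diff = mu1 - mu0
    _, logdet_cov0 = np.linalg.slogdet(cov0)
    _, logdet_theta1 = np.linalg.slogdet(theta1)
    # log(det(cov1) / det(cov0)) = -logdet(theta1) - logdet(cov0)
    return 0.5 * (np.trace(theta1 @ cov0) + diff @ theta1 @ diff - k
                  - logdet_theta1 - logdet_cov0)

def assign_with_continuity(cost, beta):
    """Minimise sum_t cost[t, z_t] + sum_{t>=1} beta[t] * [z_t != z_{t-1}].

    cost: (T, K) array, assumed to be the negative log-likelihood of
          time step t under cluster k.
    beta: length-T array of non-negative switching penalties (hypothetical
          stand-in for the adaptive state continuity); beta[0] is unused.
    Returns a length-T integer array of cluster labels.
    """
    T, K = cost.shape
    total = cost[0].astype(float)          # best cost of paths ending in each state
    back = np.zeros((T, K), dtype=int)     # back-pointers for path recovery
    states = np.arange(K)
    for t in range(1, T):
        best_prev = int(np.argmin(total))
        switch = total[best_prev] + beta[t]     # cost of arriving from the best other state
        stay = total <= switch                  # staying is at least as cheap as switching
        back[t] = np.where(stay, states, best_prev)
        total = np.where(stay, total, switch) + cost[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmin(total))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path
```

With a zero penalty the assignment reduces to picking the cheapest cluster at every time step independently; larger values of `beta` favour longer contiguous segments, which is the effect the abstract attributes to state continuity.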
Keywords/Search Tags: multivariate time series, clustering, sparse inverse covariance, multivariate Gaussian model