
Research On The Extraction Of U-shapelets In Time Series Clustering

Posted on: 2019-11-19
Degree: Master
Type: Thesis
Country: China
Candidate: Q H Meng
Full Text: PDF
GTID: 2370330596958685
Subject: Computer Science and Technology
Abstract/Summary:
Time series data exist widely in many fields of daily life. A time series records the observed values of some quantity at each given time interval, and such data are typically high-dimensional and large in volume. Time series clustering, an important branch of data mining, has attracted great interest in the last decade. Most work on time series clustering operates on the whole time series. However, noise and other unrelated data reduce the accuracy of clustering. Recently, a u-shapelet-based time series clustering method has been proposed. The method uses local features to distinguish different time series, which not only yields high clustering performance but also offers an intuitive interpretation of the clustering results.

Although u-shapelet-based time series clustering methods have achieved promising results, several problems remain, mainly in three respects. First, extracting u-shapelets is very time-consuming, particularly for huge datasets. Second, methods that accelerate the extraction of u-shapelets while ignoring their quality reduce the accuracy of the clustering results. Third, the original quality measure of u-shapelets is not accurate enough, which weakens the discriminative power of the u-shapelets. This thesis focuses on these problems and mainly studies the extraction of u-shapelets. The main contributions are as follows:

(1) We propose a Random Local Search algorithm (RLS algorithm for short). First, to address the time-consuming extraction of u-shapelets, the RLS algorithm uses random sampling to draw a fixed number of subsequences from the exhaustive subsequence space, which greatly reduces the number of u-shapelet candidates. Second, to improve the quality of the u-shapelets, a local search strategy is introduced to obtain the locally optimal subsequence as the u-shapelet. We test the RLS algorithm extensively on 27 time series datasets. The experimental results show that the RLS algorithm discovers u-shapelets quickly while improving the accuracy of clustering.

(2) We propose an efficient u-shapelet extraction algorithm based on Feature Points (FPs algorithm for short). Unlike the RLS algorithm, the FPs algorithm discovers the u-shapelets of time series from feature points, which not only filters out a large number of redundant subsequences and effectively reduces the time spent extracting u-shapelets, but also ensures the quality of the u-shapelets. In addition, a new quality measure for u-shapelets is proposed that enhances their discriminative power by fully considering the compactness of similar time series. Experimental results show that the FPs algorithm extracts u-shapelets faster while retaining the accuracy and certainty of the clustering results.
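
The random-sampling and local-search idea behind the RLS algorithm can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the `radius` parameter, and the neighbourhood definition (shifting the start offset within the same series) are hypothetical, and `separation` is a crude stand-in quality score (the largest jump between sorted subsequence distances), not the quality measure used or proposed in the thesis.

```python
import math
import random


def znorm(seq):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in seq]


def sdist(candidate, series):
    """Smallest length-normalized Euclidean distance between the
    z-normalized candidate and any same-length window of `series`."""
    m = len(candidate)
    c = znorm(candidate)
    best = math.inf
    for start in range(len(series) - m + 1):
        w = znorm(series[start:start + m])
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(c, w)) / m)
        best = min(best, d)
    return best


def separation(candidate, dataset):
    """Stand-in quality score (an assumption, not the thesis's measure):
    the largest jump between consecutive sorted distances to all series."""
    dists = sorted(sdist(candidate, s) for s in dataset)
    return max(b - a for a, b in zip(dists, dists[1:]))


def sample_candidates(dataset, length, count, rng):
    """Draw `count` random (series index, start offset) pairs instead of
    enumerating every subsequence of the given length."""
    picks = []
    for _ in range(count):
        i = rng.randrange(len(dataset))
        start = rng.randrange(len(dataset[i]) - length + 1)
        picks.append((i, start))
    return picks


def local_search(dataset, pick, length, radius=2):
    """Hill-climb over nearby start offsets in the same series until no
    neighbour within `radius` has a strictly higher score."""
    i, start = pick
    series = dataset[i]
    best = separation(series[start:start + length], dataset)
    while True:
        lo = max(0, start - radius)
        hi = min(len(series) - length, start + radius)
        scored = [(separation(series[s:s + length], dataset), s)
                  for s in range(lo, hi + 1)]
        top, top_start = max(scored)
        if top <= best:
            return (i, start), best
        best, start = top, top_start


def extract(dataset, length, count, seed=0):
    """Sample candidates, refine each by local search, keep the best."""
    rng = random.Random(seed)
    refined = [local_search(dataset, p, length)
               for p in sample_candidates(dataset, length, count, rng)]
    return max(refined, key=lambda r: r[1])
```

Calling `extract(dataset, length, count)` on a list of equal-scale numeric series returns a `((series index, start offset), score)` pair. The number of candidates scored grows with `count` rather than with the size of the full subsequence space, which is the source of the speed-up the abstract describes.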
Keywords/Search Tags:Time Series Clustering, U-shapelets, Subsequences Extraction, Local Search, Feature Point Extraction