Research On Multi-scale Representation Of Time Series

Posted on: 2020-10-29
Degree: Master
Type: Thesis
Country: China
Candidate: M Q Li
Full Text: PDF
GTID: 2370330602950650
Subject: Engineering
Abstract/Summary:
With the development of the information industry, we have entered the big data era of "Internet +", in which large amounts of data are acquired and accumulated. Extracting useful information quickly from massive, complex, and diverse time series has become particularly important. To ensure the accuracy and validity of the information obtained by data mining, the time series data must be represented effectively. Analysis methods based on traditional data and statistics are not suited to such massive, rapidly updated data. Inspired by the concept of scale space, this paper proposes a multi-scale time series representation method that analyzes and represents a time series at multiple time scales, so that the scales can later be considered together or an appropriate scale can be selected for analysis. The nearest neighbor distance (NND) is used as the anomaly score to evaluate how anomalous a given sequence is at each scale, and the final anomaly score is obtained by weighting the scores from all scales. In this way, a time series anomaly detection framework based on the multi-scale representation method is established. The main research contents of this paper are as follows:

Different scales correspond to different time intervals, such as years, months, days, hours, and minutes. Different people and practical problems need different scales of time series, so the multi-scale representation of time series is a direction worth studying. Senior managers pay more attention to high-scale data (in years, quarters, etc.), while front-line workers pay more attention to low-scale data (in minutes, hours, etc.). Therefore, this paper proposes a multi-scale representation method that divides the time series equally at different time intervals into multiple scales and uses the same representation method to extract features at each scale, forming a multi-scale data representation. The NND at each scale is used as the outlier score to evaluate the outlier degree of the given sequence at that scale, and the per-scale outlier scores are combined using weight factors to obtain the final outlier detection result. Experimental studies on synthetic and public data show that this method is more accurate than the single-scale representation method, and that its F1 score is 1.5 times higher than that of the PAA method.

When people observe something, they tend to observe and analyze it from various perspectives and consider the information comprehensively. When every scale adopts the same representation method, only a single type of feature is captured, which often leads to missed detections and false detections. Building on the representation over multiple time scales, a multi-scale, multi-feature representation method is proposed. A different data representation method is selected for each scale to extract the various features formed in the corresponding scale space, so that the same time series can be observed and analyzed from different perspectives and the diverse needs of users can be met. Experimental studies on synthetic and publicly available data demonstrate that the proposed method discriminates among various anomalies better and is more accurate than a single-scale method, with an average improvement of 61.65% and minimum and maximum improvements of 21.5% and 131.5%, respectively.
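The pipeline described in the abstract (split the series at several window lengths, represent each window, score each window by its nearest neighbor distance, then combine the per-scale scores with weights) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation. The abstract does not give the details, so several choices here are assumptions: PAA is taken in its usual sense of piecewise aggregate approximation (segment means), the distance is Euclidean, each scale's scores are normalized by their maximum, and a window's score is spread over the points it covers before the weighted sum. All function and variable names are hypothetical.

```python
from math import sqrt


def paa(window, n_segments=4):
    """Piecewise aggregate approximation: the mean of each equal-length segment.

    Assumes len(window) >= n_segments; points past the last full segment are ignored.
    """
    seg_len = len(window) // n_segments
    return [sum(window[i * seg_len:(i + 1) * seg_len]) / seg_len
            for i in range(n_segments)]


def split_windows(series, window_len):
    """Cut the series into consecutive, non-overlapping windows (incomplete tail dropped)."""
    return [series[i:i + window_len]
            for i in range(0, len(series) - window_len + 1, window_len)]


def nnd_scores(features):
    """Euclidean distance from each feature vector to its nearest other vector.

    Needs at least two vectors; the Euclidean metric is an assumption.
    """
    scores = []
    for i, f in enumerate(features):
        dists = [sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))
                 for j, g in enumerate(features) if j != i]
        scores.append(min(dists))
    return scores


def multiscale_scores(series, scales):
    """Weighted sum of per-scale NND scores, returned as one score per point.

    `scales` is a list of (window_len, weight, represent) tuples, where
    `represent` maps a window to a feature vector. Normalizing each scale by
    its maximum and spreading window scores over points are assumptions.
    """
    total = [0.0] * len(series)
    for window_len, weight, represent in scales:
        windows = split_windows(series, window_len)
        scores = nnd_scores([represent(w) for w in windows])
        peak = max(scores) or 1.0  # avoid dividing by zero when all windows match
        for k, score in enumerate(scores):
            for t in range(k * window_len, (k + 1) * window_len):
                total[t] += weight * score / peak
    return total


# Toy data: a repeating pattern with a flat burst injected at indices 20..23.
series = [0.0, 1.0, 0.0, -1.0] * 16
series[20:24] = [5.0, 5.0, 5.0, 5.0]

# Three scales with equal weights, all using the same representation (PAA).
scales = [(4, 1 / 3, paa), (8, 1 / 3, paa), (16, 1 / 3, paa)]
scores = multiscale_scores(series, scales)
```

On this toy series the highest combined scores fall on indices 20 to 23, where all three scales flag the burst. Passing a different `represent` callable for each scale, instead of `paa` everywhere, gives a rough analogue of the multi-feature variant described in the last paragraph of the abstract.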
Keywords/Search Tags: multi-scale, time series, data representation, anomaly detection