Research Of Unsupervised Learning Algorithm On Time Series

Posted on: 2014-11-18
Degree: Master
Type: Thesis
Country: China
Candidate: H J Lin
GTID: 2308330461972611
Subject: Applied Mathematics
Abstract/Summary:
A time series is a collection of observed values recorded sequentially in time. Time series data arise in nearly every area of daily life and industrial production, so uncovering the information hidden in them has considerable practical significance. Research on learning from time series covers supervised, semi-supervised, and unsupervised algorithms. Unsupervised algorithms can learn from a whole time series data set without a training set containing class labels, so they are likely to be widely studied and applied. Research on unsupervised learning for time series data is nevertheless still limited, and many questions remain to be solved or improved. The main topics at present include unsupervised feature extraction, forecasting, classification, clustering, and anomaly detection. This dissertation studies two of them in depth: clustering of whole time series and unsupervised detection of anomalous sequences. The main work is as follows:

1 Clustering Analysis of Time Series

Time series clustering algorithms tend to depend on the initial cluster centers, which makes their results unstable. To overcome this problem, the dissertation introduces the affinity propagation (AP) algorithm and proposes an AP-NN clustering model for time series. Because AP often produces a number of clusters that does not match the true number, the initial clusters are divided again by nearest neighbor, which resolves the uncertainty in the number of clusters produced by AP.
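The nearest-neighbor step described above can be sketched as follows. The abstract does not spell out the rule, so this is only one plausible reading and not the dissertation's method: AP is assumed to have already returned too many exemplars, the closest pair of exemplars is merged until a target number of clusters remains, and every series is then assigned to its nearest surviving exemplar. The function names, the Euclidean distance on equal-length series, the hand-picked exemplar indices, and the `target_k` parameter are all assumptions made for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length series (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(series, exemplars):
    """Label every series with the index of its nearest exemplar."""
    return [min(exemplars, key=lambda e: distance(s, series[e])) for s in series]


def merge_to_k(series, exemplars, target_k):
    """Hypothetical NN refinement: merge the closest exemplars down to target_k."""
    exemplars = list(exemplars)
    while len(exemplars) > target_k:
        labels = assign(series, exemplars)
        # Find the two exemplars that are nearest to each other.
        a, b = min(
            ((p, q) for i, p in enumerate(exemplars) for q in exemplars[i + 1:]),
            key=lambda pair: distance(series[pair[0]], series[pair[1]]),
        )
        # Drop the exemplar of that pair that currently attracts fewer series
        # (on a tie, the second one is dropped).
        drop = a if labels.count(a) < labels.count(b) else b
        exemplars.remove(drop)
    return assign(series, exemplars)


# Toy data: two well-separated groups of three short series each.
series = [
    [0.0, 0.0, 0.0],
    [0.1, 0.1, 0.1],
    [0.2, 0.2, 0.2],
    [5.0, 5.0, 5.0],
    [5.1, 5.1, 5.1],
    [5.2, 5.2, 5.2],
]

# Pretend AP over-split the first group and returned three exemplars.
labels = merge_to_k(series, exemplars=[0, 1, 4], target_k=2)
print(labels)  # [1, 1, 1, 4, 4, 4]
```

Each label is the index of the exemplar that represents the series. Unequal-length series, which the dissertation also handles, would need a different distance than the one assumed here.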
The AP-NN clustering algorithm was applied to unequal-length time series data sets generated by the shape signature method and to equal-length time series data sets from UCI. The experimental results show that the algorithm effectively improves the similarity of the clustering results.

2 Unsupervised Anomaly Detection in Time Series Data

To keep normal samples from ending up in abnormal clusters, a two-stage algorithm is proposed. To account for both local and global properties, a new anomaly factor is defined for anomaly detection. At present, no definition of a time series outlier is accepted by most researchers. Based on the three types of time series anomalies, the dissertation studies anomalous sequences in depth. Relying on the assumption that normal data far outnumber anomalous data, it obtains normal clusters and candidate abnormal clusters from the AP clustering algorithm. It then combines the local outlier factor with a global anomaly factor to obtain a new anomaly factor that measures the degree of abnormality of a time series. Experimental analysis shows that the method improves detection efficiency and has certain advantages.
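The two-stage idea above can be illustrated with a small sketch. The abstract does not give the formula of the new anomaly factor, so everything here is an assumption: clusters holding less than a `min_share` fraction of the data are treated as candidate abnormal clusters, the local part is the standard local outlier factor (LOF) with `k` neighbors, the global part is the distance to the centroid of the normal series divided by the mean such distance among normal series, and the two are mixed with a weight `alpha`. The cluster labels are typed in by hand to stand in for an AP result.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length series (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn(i, series, k):
    """Indices of the k nearest neighbors of series[i], excluding i itself."""
    others = [j for j in range(len(series)) if j != i]
    return sorted(others, key=lambda j: distance(series[i], series[j]))[:k]


def local_outlier_factor(i, series, k):
    """Standard LOF; recomputes neighbors on every call, fine for a toy example."""
    def k_distance(j):
        return distance(series[j], series[knn(j, series, k)[-1]])

    def lrd(j):  # local reachability density
        reach = [max(k_distance(o), distance(series[j], series[o]))
                 for o in knn(j, series, k)]
        return len(reach) / sum(reach)

    neighbors = knn(i, series, k)
    return sum(lrd(o) for o in neighbors) / (len(neighbors) * lrd(i))


def global_factor(i, series, normal):
    """Hypothetical global factor: distance to the normal centroid, rescaled."""
    length = len(series[0])
    centroid = [sum(series[j][t] for j in normal) / len(normal)
                for t in range(length)]
    spread = sum(distance(series[j], centroid) for j in normal) / len(normal)
    return distance(series[i], centroid) / spread


def anomaly_scores(series, labels, k=2, min_share=0.2, alpha=0.5):
    """Stage 1: pick candidates from small clusters. Stage 2: score them."""
    counts = {c: labels.count(c) for c in set(labels)}
    normal = [i for i, c in enumerate(labels)
              if counts[c] / len(labels) >= min_share]
    candidates = [i for i in range(len(series)) if i not in normal]
    return {i: alpha * local_outlier_factor(i, series, k)
               + (1 - alpha) * global_factor(i, series, normal)
            for i in candidates}


series = [
    [0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.6],  # normal cluster
    [6.0, 6.0],  # far from everything
    [1.2, 1.1],  # in its own small cluster, but close to the normal series
]
labels = [0, 0, 0, 0, 0, 5, 6]  # hand-written stand-in for AP cluster labels

scores = anomaly_scores(series, labels)
for index, score in sorted(scores.items()):
    print(index, round(score, 2))
```

Only the two series in small clusters are scored. The series at index 6 sits next to the normal cluster and gets a much lower score than the one at index 5, which is the kind of case the second stage is meant to catch. The sketch assumes at least one normal cluster exists and that the normal series are not all identical.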
Keywords/Search Tags: time series, shape feature, cluster, AP algorithm, anomaly detection