Clustering Of Time Series Data Based On Non-negative Matrix Factorization

Posted on: 2017-01-05 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Qin | Full Text: PDF
GTID: 2348330503486919 | Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of information engineering and modern society, more and more data carry time and space dimensions; such data are known as time series data. Time series data differ substantially from traditional static data. First, they are incremental: time series data are usually dynamic and arrive incrementally, both through the addition of new data objects and through the extension of each existing object along the time axis. Second, they are heterogeneous: features such as text, images, relations and time may lie in different dimensions, and some are numerical while others are categorical, so the features cannot be fused during clustering by simply summing them. Finally, they are large in scale: because time series data sets are often huge, traditional algorithms cannot respond within the time users require.

A variety of methods have been developed to cluster different types of time series data. Because time series data differ from traditional data, traditional clustering algorithms do not achieve sufficiently high accuracy on them. Even when accuracy can be guaranteed, the high dimensionality of time series causes the computation time to grow exponentially.

Motivated by this background and by the problems that time series clustering currently faces, this thesis proposes a clustering algorithm for time series data based on non-negative matrix factorization (NMF). NMF describes local information, which is used here to characterize the information contained in time series data. Compared with other subspace learning algorithms, NMF has the distinctive property of retaining local rather than global information during decomposition. This thesis therefore uses NMF to represent and describe time series data, and then improves it in three respects.
First, because NMF converges slowly on time series data, the sparsity of the coefficient matrix is increased, which speeds up convergence and also removes noise. Second, a smoothness constraint term on the basis matrix is added to the objective function. Clustering time series with NMF also amounts to a new way of representing time series features: each column of the basis matrix represents a pattern, so, given the continuity of time series data, the column vectors of the basis matrix are required to be continuous as well. Third, a dissimilarity measure is computed over the column vectors of the basis matrix. Because the basis matrix corresponds to a set of patterns that may be correlated, redundancy among its column vectors should be small; that is, the more the columns differ from one another, the better. Experimental results show that the proposed NMF algorithm improves clustering accuracy on some time series data sets.
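
The abstract gives no formulas, so the following Python sketch is only an illustration of the ideas it names, not the thesis's actual algorithm. It assumes a data matrix X whose columns are non-negative time series, a basis matrix W, a coefficient matrix H, a standard multiplicative-update NMF with an L1 sparsity weight on H, and two hypothetical diagnostic functions: squared first differences of the basis columns as a smoothness measure, and squared cosine similarity between basis columns as a redundancy measure. All names, penalty forms and parameter values are assumptions, and the toy data are synthetic.

```python
import numpy as np


def nmf_sparse(X, k, sparsity=0.0, n_iter=500, seed=0, eps=1e-9):
    """Approximate non-negative X (T x n) by W (T x k) @ H (k x n).

    Multiplicative updates for 0.5 * ||X - WH||_F^2 + sparsity * sum(H).
    This is a generic sparse-NMF sketch, assumed for illustration.
    """
    rng = np.random.default_rng(seed)
    T, n = X.shape
    W = rng.random((T, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # The sparsity weight in the denominator shrinks the coefficients.
        H *= (W.T @ X) / (W.T @ W @ H + sparsity + eps)
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
        # Rescale so each basis column has unit length; W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H


def smoothness_penalty(W):
    """Sum of squared first differences along time of each basis column."""
    return float(np.sum(np.diff(W, axis=0) ** 2))


def redundancy_penalty(W, eps=1e-12):
    """Sum of squared cosine similarities between distinct basis columns."""
    Wn = W / (np.linalg.norm(W, axis=0, keepdims=True) + eps)
    G = Wn.T @ Wn
    off_diag = G - np.diag(np.diag(G))
    return float(np.sum(off_diag ** 2))


# Synthetic toy data: ten series with an early bump, ten with a late bump.
t = np.linspace(0.0, 1.0, 50)
early = np.exp(-((t - 0.25) / 0.08) ** 2)
late = np.exp(-((t - 0.75) / 0.08) ** 2)
rng = np.random.default_rng(1)
X = np.column_stack([early] * 10 + [late] * 10)
X = X + 0.01 * rng.random(X.shape)  # small non-negative noise

W, H = nmf_sparse(X, k=2, sparsity=0.01)

# Assign each series to the basis column with the largest coefficient.
labels = np.argmax(H, axis=0)
print("cluster labels:", labels)
print("smoothness of W:", smoothness_penalty(W))
print("redundancy of W:", redundancy_penalty(W))
```

In this sketch the smoothness and redundancy measures are only evaluated on the learned basis. Adding them to the objective, as the abstract describes, would require update rules that the abstract does not specify.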
Keywords/Search Tags: time series data clustering, NMF, feature extraction, subspace learning