Research On Time Series Data Classification Algorithm Based On Factor Space Theory

Posted on: 2022-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: S S Xue
Full Text: PDF
GTID: 2480306722468424
Subject: Mathematics and Applied Mathematics
Abstract/Summary:
The main task of data mining is to discover the relationships between the factors in the data. In factor space theory, factors are the elements of causal analysis, and two kinds of relation hold between factors: correlation and causality. Causality emphasizes that the cause precedes the consequence, which is especially evident in time series data mining. When building a time series data mining model, one must consider how to extract features, how the corresponding factors change, and whether a causal relationship exists between the changing factors and the classification results.

First, this paper discusses factor correlation and causality under factor space theory, and gives the definition of factor correlation and factor correlation, as well as the conditions that a causal relationship must satisfy. Feature extraction is an important step in time series data mining, and factors must contain observable features in order to distinguish the objects under discussion, so the dialectical relationship between features and factors is analyzed. In general, entropy can be used as a measure of the interaction between factors and as an indicator of feature importance. This paper therefore extends the existing binary classification formula of linear entropy with a multi-class formula. Numerical experiments show that linear entropy and information entropy suffer from underfitting and overfitting, respectively. When ensemble learning is used to compensate for these disadvantages, linear entropy not only has low computational complexity but can also effectively replace information entropy in some machine learning algorithms. This conclusion is also confirmed on the typical C4.5 and random forest decision algorithms.

On the basis of the above theory, this paper improves the Shapelets time series algorithm. The original algorithm, as the main algorithm for mining the local characteristics of time series, has high accuracy and strong interpretability. However, it also suffers from high algorithmic complexity, and the method for extracting local features is difficult to determine. The improved algorithm transforms the data with one-dimensional convolution and constructs different decision trees by randomly extracting the same number of shapelets several times. Finally, the category of a time series is determined by majority voting. The experimental results show that the Shapelet classification algorithm based on one-dimensional convolution feature extraction not only greatly improves on the accuracy of the original algorithm but also enhances the interpretability of the results. In addition, a convolution kernel that performs well can also reflect hidden autocorrelation characteristics of the time series, which helps further discovery of the rules by which time series change. There are 12 figures, 9 tables and 51 references in this paper.
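The three building blocks named in the abstract (one-dimensional convolution of the series, shapelet matching, and majority voting over several trees) might be sketched as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the function names, the example kernel, and the use of minimum sliding-window Euclidean distance as the shapelet feature are all assumptions, and the decision trees themselves are omitted, so the vote simply takes a list of per-tree predictions.

```python
from collections import Counter

import numpy as np


def conv1d_transform(series, kernel):
    """Filter a series with a 1-D kernel, keeping only full overlaps.

    np.convolve flips its second argument, so the kernel is reversed here
    to obtain the sliding dot product used by convolutional layers.
    """
    return np.convolve(series, kernel[::-1], mode="valid")


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any window
    of the same length in the series (assumed shapelet feature)."""
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    return float(np.min(np.linalg.norm(windows - shapelet, axis=1)))


def majority_vote(predictions):
    """Return the label predicted most often (one prediction per tree)."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical toy data, chosen only to exercise the functions.
series = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0])
kernel = np.array([1.0, 1.0, 1.0]) / 3.0      # assumed smoothing kernel
shapelet = np.array([1.0, 2.0, 1.0])          # assumed candidate shapelet

smoothed = conv1d_transform(series, kernel)   # length 7 - 3 + 1 = 5
distance = shapelet_distance(series, shapelet)
label = majority_vote(["A", "B", "A"])        # stand-in for tree outputs
```

In a fuller pipeline, each tree would be trained on the distances to its own random sample of shapelets, and `majority_vote` would combine the trees' predictions for a new series.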
Keywords/Search Tags:Factor space, Linear entropy, Feature extraction, Shapelets time series classification algorithm, Factor correlation and correlation