
Extension of 2DPCA for Classifying Multivariate Time Series with Different Time Lengths

Posted on: 2022-11-05  Degree: Master  Type: Thesis
Country: China  Candidate: M Liu  Full Text: PDF
GTID: 2480306749464334  Subject: Probability theory and mathematical statistics
Abstract/Summary:
With the continuing development of technology, high-dimensional data has become prevalent in daily life and work, and it poses great challenges for data transmission and storage. Such data is also inconvenient to study and apply, so its dimension needs to be reduced. Principal component analysis (PCA) is a classic dimension reduction method, but it applies only to vector data. To reduce the dimension of matrix data with PCA, each matrix must first be converted into a vector, which destroys the internal structure of the data. Two-dimensional principal component analysis (2DPCA) targets matrix data and can be applied to it directly, but it reduces the dimension along only the row direction or only the column direction. Bidirectional two-dimensional principal component analysis (BPCA) was therefore proposed. BPCA reduces the row and column dimensions of matrix data simultaneously, and its dimension reduction performance is better than that of 2DPCA.

When multivariate time series data is collected, the measurement time of each variable may differ, so the time dimension of the samples is inconsistent; that is, the data has different lengths. Two approaches are commonly used at present. The first truncates all samples to the minimum length, but this destroys part of the data structure and loses part of the information, which degrades the dimension reduction. The second randomly copies data from the same sample to fill each sample up to the maximum length. Filling the data with the second approach and then reducing the dimension works better than the first approach.

To fill data of different lengths more effectively, this thesis extends 2DPCA and proposes three filling algorithms suitable for data of different lengths: an EM algorithm under complete data, a CM algorithm under complete data, and an EM algorithm under missing data. All three use the information in the data to fill each sample to the longest length. The missing part is denoted X_m. The EM algorithm under complete data requires the missing part X_m as an input and optimizes the projection matrix W and the low-dimensional matrix Z obtained after dimension reduction, so it is denoted the 2DPCA1-Xm Z algorithm. The CM algorithm under complete data requires X_m as an input and optimizes W, so it is denoted the 2DPCA2-Xm algorithm. The EM algorithm under missing data does not require X_m as an input and optimizes W and Z directly, so it is denoted the 2DPCA3-Z algorithm. This thesis also extends the 2DPCA model with a mean term and proposes the corresponding filling algorithms: the M2DPCA1-Xm Z algorithm, the M2DPCA2-Xm algorithm and the M2DPCA3-Z algorithm.

To verify the advantages of the proposed filling algorithms, two experiments were carried out on real multivariate time series data. The first experiment uses the ECG data and computes the mean squared error (MSE) between the filled data and the real data in order to compare the filling methods. The results show that data filled by the proposed methods has a smaller MSE. The second experiment was conducted on four real multivariate time series data sets: Japanese Vowels, ECG, Wafer and AUSLAN. The data were filled by the proposed algorithms and then reduced in dimension by BPCA, and the resulting low-dimensional matrices were classified with a nearest neighbor classifier using Euclidean distance. The classification results show that, compared with truncating all data to the minimum length or randomly filling the data to the maximum length, the proposed filling algorithms perform better and give a lower classification error rate with BPCA.
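
As an illustration of the baseline pipeline the abstract compares against, the following is a minimal NumPy sketch of the two common length-handling approaches (truncation and random copying), followed by BPCA and nearest neighbor classification with Euclidean distance. It is not the thesis's EM or CM filling algorithms, whose update equations are not given in the abstract. All function names are hypothetical, each sample is assumed to be a (time x variables) array, and "randomly copying data of the same sample" is assumed here to mean appending rows drawn at random from that sample.

```python
import numpy as np

def truncate_to_min(samples):
    """Baseline 1: cut every (T_i x d) sample to the shortest length."""
    t_min = min(x.shape[0] for x in samples)
    return np.stack([x[:t_min] for x in samples])

def pad_by_random_copy(samples, rng):
    """Baseline 2 (assumed form): extend each sample to the longest length
    by appending rows drawn at random, with replacement, from that sample."""
    t_max = max(x.shape[0] for x in samples)
    padded = []
    for x in samples:
        extra = rng.integers(0, x.shape[0], size=t_max - x.shape[0])
        padded.append(np.vstack([x, x[extra]]))
    return np.stack(padded)

def top_eigvecs(cov, k):
    """Eigenvectors of a symmetric matrix for its k largest eigenvalues."""
    _, vecs = np.linalg.eigh(cov)      # eigenvalues come back ascending
    return vecs[:, ::-1][:, :k]

def fit_bpca(X, k_row, k_col):
    """Fit BPCA on X of shape (n, T, d).

    Returns the sample mean, a row projection U (T x k_row) and a
    column projection V (d x k_col)."""
    mean = X.mean(axis=0)
    C = X - mean
    col_cov = np.einsum('ntd,nte->de', C, C) / len(X)   # mean of C^T C, d x d
    row_cov = np.einsum('ntd,nsd->ts', C, C) / len(X)   # mean of C C^T, T x T
    return mean, top_eigvecs(row_cov, k_row), top_eigvecs(col_cov, k_col)

def transform_bpca(X, mean, U, V):
    """Project each centered sample to U^T (X_i - mean) V, shape (k_row x k_col)."""
    return np.einsum('tr,ntd,dc->nrc', U, X - mean, V)

def nearest_neighbor_predict(Z_train, y_train, Z_test):
    """1-NN with Euclidean (Frobenius) distance between low-dimensional matrices."""
    diffs = Z_test[:, None] - Z_train[None]              # (m, n, r, c)
    dists = np.sqrt((diffs ** 2).sum(axis=(2, 3)))
    return y_train[dists.argmin(axis=1)]
```

In this sketch, a filling algorithm such as those proposed in the thesis would replace pad_by_random_copy; the BPCA and nearest neighbor steps would stay the same.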
Keywords/Search Tags: Two-dimensional principal component analysis, Different Time Length, Filling algorithm, Classification experiment