Correlation Analysis And Variable Selection For Multivariate Time Series Based On Mutual Information

Posted on: 2014-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: X X Liu
GTID: 2230330398950372
Subject: Control theory and control engineering
Abstract/Summary:
Multivariate time series (MTS) are common in industrial production and everyday life. The correlations and dependencies between different series and across time steps are often complex and changeable. Analyzing these correlations and dependencies helps identify irrelevant and redundant variables, so that the input variables can be selected, the model size and computational time reduced, and the prediction performance improved.

We focus on variable dimension reduction based on the characteristics of the data and on correlation analysis, so that an appropriate group of inputs can be constructed. Because mutual information (MI) can describe both linear and nonlinear relationships without distributional assumptions, we use it for correlation analysis and propose variable selection methods based on MI.

To address the imbalance between relevance and redundancy in existing single-criterion methods, a stepwise variable selection method is proposed that proceeds in two steps: select the relevant variables, then discard the weakly relevant ones. The method is also applied to selecting the hidden nodes of RBF networks, so that both the input layer and the hidden layer are optimized. Because wrapper methods need to train models iteratively, they are usually time-consuming. We therefore propose a wrapper method that combines MI with the fast training process of Extreme Learning Machines (ELMs), so that the sizes of the input layer and the hidden layer are determined automatically.

The accuracy of MI estimation is very important for variable selection, and estimating the joint probability density is the main difficulty. To deal with this difficulty, we convert the estimation of MI into the estimation of Copula entropy, and propose a parametric method based on different kinds of Copula functions and a nonparametric method based on truncated k nearest neighbors.
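
The conversion from MI to Copula entropy rests on a standard identity, sketched here for two continuous variables. The symbols $F_X$, $F_Y$, $u$, $v$, $c$ and $H_c$ are notation assumed for this illustration and are not taken from the thesis.

```latex
% Sklar's theorem: the joint density factors into the marginals and a copula density c
p(x, y) = c\bigl(F_X(x), F_Y(y)\bigr)\, p(x)\, p(y)

% Substituting u = F_X(x), v = F_Y(y) into the definition of MI
I(X; Y) = \iint p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)} \, dx \, dy
        = \int_0^1 \!\! \int_0^1 c(u, v) \log c(u, v) \, du \, dv
        = -H_c(u, v)
```

Under this identity the marginal densities drop out, so only the copula density on the unit square has to be estimated.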
The nonparametric method is applied to variable selection for the meteorological series of Dalian.

To better classify MTS that are represented as matrices, we use MI for feature extraction and introduce a new definition, variable separability, which is based on the concept of class separability and serves as the redundancy criterion. Simulation results on EEG data show that the proposed method picks out the variables best suited for classification and thereby improves the classification accuracy.
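
The general idea of MI-based filtering described in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: the histogram estimator stands in for the Copula-entropy estimators, the second step uses a generic pairwise redundancy test, and the names `mutual_information`, `stepwise_select`, `relevance_min` and `redundancy_max`, as well as the threshold values and the synthetic data, are hypothetical choices made for this example.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in (histogram) estimate of I(X;Y) in nats; a simple stand-in estimator."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def stepwise_select(X, y, relevance_min=0.05, redundancy_max=0.5, bins=8):
    """Two-step filter with assumed thresholds: keep relevant columns, then drop redundant ones."""
    n_vars = X.shape[1]
    # Step 1: rank columns by MI with the target and keep those above the threshold.
    relevance = [mutual_information(X[:, j], y, bins) for j in range(n_vars)]
    ranked = sorted(range(n_vars), key=lambda j: relevance[j], reverse=True)
    candidates = [j for j in ranked if relevance[j] > relevance_min]
    # Step 2: skip a candidate that shares too much information with a kept column.
    selected = []
    for j in candidates:
        if all(mutual_information(X[:, j], X[:, k], bins) < redundancy_max
               for k in selected):
            selected.append(j)
    return selected

# Synthetic data (assumed for illustration): x1 nearly duplicates x0, x2 is pure noise,
# and the target depends on x0 nonlinearly, so linear correlation with x0 is near zero.
rng = np.random.default_rng(0)
n = 5000
x0 = rng.normal(size=n)
x1 = x0 + 0.05 * rng.normal(size=n)
x2 = rng.normal(size=n)
y = x0 ** 2 + 0.1 * rng.normal(size=n)
X = np.column_stack([x0, x1, x2])

selected = stepwise_select(X, y)
print(selected)
```

In this toy setting the noise column fails the relevance test and only one of the two near-duplicate columns survives the redundancy test. Histogram estimates are biased for small samples, which is one reason the accuracy of the MI estimator matters for the selection result.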
Keywords/Search Tags: Multivariate Time Series, Correlation Analysis, Variable Selection, Mutual Information