The Research Of Variable Selection Method Based On Mutual Information

Posted on: 2014-02-01  Degree: Master  Type: Thesis
Country: China  Candidate: X X Long  Full Text: PDF
GTID: 2251330425472914  Subject: Analytical Chemistry
Abstract/Summary:
The emergence of modern analytical instruments and progress in computer technology have done much to advance Analytical Chemistry and Life Science. High-throughput instruments now yield vast amounts of data per sample, with variables such as gene-chip measurements, mass-to-charge ratios in mass spectrometry, and wavelengths in near-infrared or Raman spectra. This raises a new problem: how to select informative variables from such large datasets, and how to build corresponding models for analysis and recognition. As a solution, we propose a new variable selection method, MPA-MMIFS. It is based on mutual information combined with Model Population Analysis (MPA): the relevance between the input variables and the response is maximized, while the redundancy among the selected variables is minimized. In addition, the regression coefficients of Partial Least Squares Linear Discriminant Analysis (PLS-LDA) are introduced to adjust variable importance. The proposed method was used to select variables and build models on three real-world datasets (estrogen gene expression data, Type 2 Diabetes Mellitus metabolomics data, and near-infrared spectroscopy data of vinegar), and both cross validation (CV) and double cross validation (DCV) were used to assess the models. Compared with other methods (MIFS, MMIFS and GA), the proposed method achieved competitive performance.
Keywords/Search Tags: Variable selection, Mutual information, Model population analysis, Partial least squares linear discriminant analysis, Cross validation
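
The "maximize relevance, minimize redundancy" idea in the abstract can be illustrated with a minimal sketch. It is not the MPA-MMIFS method itself: the abstract does not give its scoring formula, and the MPA sampling step and PLS-LDA coefficient weighting are omitted here. The sketch assumes a generic greedy criterion in the style of MIFS (relevance minus a weighted mean redundancy), assumes the variables are already discretised, and uses hypothetical names (`mutual_information`, `greedy_mi_selection`, `beta`) and toy data chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[a] * py[b]))
    return mi


def greedy_mi_selection(columns, y, k, beta=1.0):
    """Greedily pick k variables (assumed criterion, not the thesis formula).

    Each step selects the variable with the highest score:
    relevance I(x; y) minus beta times its mean mutual information
    with the variables already selected.
    """
    relevance = [mutual_information(col, y) for col in columns]
    selected = []
    remaining = list(range(len(columns)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(
            mutual_information(columns[j], columns[s]) for s in selected
        ) / len(selected)
        return relevance[j] - beta * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: the class label y is determined by two independent binary variables.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
noise = [0, 1, 0, 1, 0, 1, 0, 1]       # carries no information about y
y = [2 * ai + bi for ai, bi in zip(a, b)]

columns = [a, list(a), b, noise]       # column 1 duplicates column 0
selected = greedy_mi_selection(columns, y, k=2)
print(selected)                        # indices of the chosen columns
```

On this toy data the duplicated column is as relevant as the original but fully redundant with it, so the second pick is the independent informative column rather than the copy. Real spectral or expression data are continuous and would need discretisation or a continuous mutual information estimator before a criterion like this could be applied.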