The Research Of Variable Selection Method Based On Mutual Information

Posted on:2014-02-01

Degree:Master

Type:Thesis

Country:China

Candidate:X X Long

Full Text:PDF

GTID:2251330425472914

Subject:Analytical Chemistry

Abstract/Summary:

The emergence of modern analytical instruments and advances in computer technology have done much to promote the development of analytical chemistry and life science. High-throughput instruments now produce vast amounts of data about samples, such as gene-chip data, mass-to-charge ratios from mass spectrometry, and wavelengths from near-infrared or Raman spectra. This raises a new problem: how to select informative variables from such large datasets, and how to build a corresponding model for analysis and recognition. As a solution, we propose a new variable selection method, MPA-MMIFS. It is based on mutual information combined with Model Population Analysis (MPA): the relevance between the input variables and the response is maximized, while the redundancy among the selected variables is minimized. In addition, the regression coefficients of Partial Least Squares Linear Discriminant Analysis (PLS-LDA) are introduced to adjust the variable importance. The proposed method was tested on three real-world datasets (gene expression data of estrogen, metabolomics data of type 2 diabetes mellitus, and near-infrared spectroscopy data of vinegar), where it was used to select variables and build models; both cross validation (CV) and double cross validation (DCV) were used to assess the models. Compared with other methods (MIFS, MMIFS and GA), the proposed method achieved competitive performance.
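
The relevance/redundancy trade-off described in the abstract can be illustrated with a small sketch. This is not the thesis's MPA-MMIFS procedure (it has no Model Population Analysis step and no PLS-LDA coefficient weighting); it is only a generic greedy selection of the max-relevance, min-redundancy kind for discrete variables. The function names, the `beta` weight, the mean-based redundancy penalty, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import log


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in nats for two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def greedy_mi_selection(columns, response, k, beta=1.0):
    """Pick up to k variable names greedily (hypothetical helper).

    score(f) = I(f; response) - beta * mean over selected s of I(f; s)
    The mean-based penalty is an assumption of this sketch.
    """
    relevance = {
        name: mutual_information(col, response) for name, col in columns.items()
    }
    selected = []
    remaining = list(columns)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(columns[name], columns[s]) for s in selected
        ) / len(selected)
        return relevance[name] - beta * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)  # ties go to the earlier column
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up): the response is fully determined by u and v together.
u = [0, 0, 0, 0, 1, 1, 1, 1]
v = [0, 0, 1, 1, 0, 0, 1, 1]
w = [0, 1, 0, 1, 0, 1, 0, 1]             # unrelated to the response
response = [2 * a + b for a, b in zip(u, v)]
columns = {"u": u, "u_copy": list(u), "v": v, "w": w}

selected = greedy_mi_selection(columns, response, k=2)
print(selected)
```

On this toy data, `u_copy` is as relevant as `u` but fully redundant with it, so after `u` is chosen the penalty pushes the second pick to `v`. Real spectral or expression data are continuous and would need discretization or a different mutual information estimator before a criterion like this could be applied.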

Keywords/Search Tags:

Variable selection, Mutual information, Model population analysis, Partial least squares linear discriminant analysis, Cross validation

Related items

1	Research On Variable Selection For Soft Sensor Model
2	The PLS Variable Selection Method And Its Application
3	Quality Detection And Research Of Yongquan Tangerine Based On Near Infrared Spectroscopy Technology
4	Chemical Cluster Analysis And Linear Discriminant Analysis Method
5	Analysis Of The Influence Of Energy Consumption Structure On Air Quality In China
6	Application Of Pattern Recognition In Rapid Mass Spectrometry Analysis Of Complex Matrix Samples
7	Kriging Model Approach To Modeling Study On Relationship Between Quantitative Molecular Structures And Molecular Chemical Properties
8	On-line Prediction Of NO_x Emission From Coal-fired Boiler Based On Dominant Factor Analysis
9	Research On NIR Model Updating Method With Application In Food Detection
10	Variable Selection Methods And Their Applications In Quantitative Structure-Property Relationship (QSPR)