
Variable Selection Based On PLS And Its Application On High Dimensional Data

Posted on: 2014-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: T P Tong
GTID: 2180330422968419
Subject: Chemical Process Equipment
Abstract/Summary:
Variable selection, also known as feature selection, is one of the research hotspots in the field of information pattern recognition. With the rapid development and wide use of computer science, the study of variable selection has also made great progress. Theoretical and applied results from statistical methods and machine learning continue to emerge, and some of them have shown great potential in practical applications. This thesis focuses on Partial Least Squares (PLS) for variable selection. PLS is one of the most popular multivariate statistical regression methods. Because variable selection algorithms are widely used in different fields, we choose process analysis and bioinformatics data sets as examples to verify the validity of PLS-based variable selection combined with machine learning algorithms.

With practical application in mind, variable selection methods and machine learning regression algorithms were used as the basic tools to handle problems in process analysis and bioinformatics. Important variables were selected and interpreted as a basis for further research, guidance was given for regulation, and the nature and mechanism of the systems studied became easier to understand.

For near-infrared (NIR) spectral data in the process analysis field, a partial least squares based variable-weighted Gaussian process was used to handle data with multicollinearity and to overcome the “information saturation” phenomenon.

For the identification of essential genes in bioinformatics, the Z-curve method was first used to extract DNA sequence features, and a partial least squares classifier based on uninformative variable elimination (UVE) was then used for iterative variable selection. Essential genes can be identified, and important features related to gene essentiality can also be selected.
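As an illustration of the Z-curve step mentioned above, the following is a minimal sketch of the standard cumulative Z-curve transform of a DNA sequence. The abstract does not say which Z-curve-derived features the thesis actually uses, so the function names `z_curve` and `z_curve_features`, and the choice of a length-normalised end point as a three-value feature vector, are assumptions made only for this example.

```python
from typing import List, Tuple


def z_curve(seq: str) -> List[Tuple[int, int, int]]:
    """Return the cumulative Z-curve points (x_n, y_n, z_n) for n = 1..len(seq).

    With A_n, C_n, G_n, T_n the base counts in the first n positions:
        x_n = (A_n + G_n) - (C_n + T_n)   purine vs. pyrimidine
        y_n = (A_n + C_n) - (G_n + T_n)   amino vs. keto
        z_n = (A_n + T_n) - (G_n + C_n)   weak vs. strong hydrogen bonding
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points = []
    for base in seq.upper():
        if base not in counts:
            raise ValueError(f"unexpected base: {base!r}")
        counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        points.append(((a + g) - (c + t), (a + c) - (g + t), (a + t) - (g + c)))
    return points


def z_curve_features(seq: str) -> Tuple[float, float, float]:
    """Hypothetical 3-value summary: the final Z-curve point divided by length.

    This is an illustrative choice, not the feature set used in the thesis.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    x, y, z = z_curve(seq)[-1]
    n = len(seq)
    return (x / n, y / n, z / n)


if __name__ == "__main__":
    print(z_curve("ACGT"))
    print(z_curve_features("ATGGCA"))
```

Feature vectors of this kind, one row per gene, would form the input matrix on which a variable selection method such as UVE-PLS operates.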
Keywords/Search Tags: Variable Selection, Partial Least Squares, Gaussian Process, Z-curve