
The PLS Variable Selection Method And Its Application

Posted on: 2008-03-15 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Lin | Full Text: PDF
GTID: 2121360242979515 | Subject: Analytical Chemistry
Abstract/Summary:
In this dissertation, an approach called partial least squares (PLS) variable selection was described for selecting variables in PLS modeling. The aim of the method was to delete unimportant PLS variables by using information from the error function. The proposed approach was applied to practical problems and gave promising results.
Chapter 1 was a general introduction to the history and research fields of chemometrics, conventional pattern recognition, and variable selection. Drawing on the research work of our laboratory, the purpose, significance, and main contents of the thesis were also summarized.
The main principles of PLS regression and of the PLS variable selection method were presented in chapter 2, together with the derivation of the method. Information from PLS modeling, including the regression coefficients, was used to select the original regression variables, to eliminate unimportant or uninformative variables, and to obtain a simpler model without loss of predictive power.
Applications of PLS regression and the PLS variable selection method were presented in chapters 3 to 6. Firstly, the PLS method was applied to GC-MS data obtained from seawater samples from the main polluted sea areas of Jiaozhou Bay and Laizhou Bay. Classification models were built for seawater samples from different contaminated areas. The cross-validated correlation coefficient of the model exceeded 0.91. This method can provide reliable results for correctly distinguishing pollution sources. Moreover, the classification plots in the thesis were drawn more clearly and intuitively. Secondly, the PLS variable selection method was applied to build a prediction model for distinguishing Radix et Rhizoma Glycyrrhizae grown under different conditions; the resulting model was greatly simplified compared with the one obtained by the traditional method.
Thirdly, to improve modeling performance, the PLS variable selection method was improved and the VBA program was modified. Combined with the variable dimension expansion technique, the improved method was applied to the following two problems. In chapter 5, ICP-MS data from 244 heroin samples from three sources in Yunnan province (Kunming, Simao, and Xishuangbanna) were analyzed, and the discrimination accuracy of the model reached 95% or more. Because the model used few variables (fewer than 10), the effects of the variables on the model were easy to analyze and explain. Hence this model can be used to discriminate and identify drug sources effectively. In chapter 6, atomic spectrometry data from hair samples of 138 healthy people of different ages and sexes in Xiamen city were analyzed, and the discrimination accuracy of the model was approximately 30% higher than that of the traditional methods.
Conclusions and future prospects for this research were summarized in the last chapter. A new criterion for deleting unimportant variables was found, which can be easily calculated and used. The results of all the practical applications indicated that the variable selection method, which adopts the new criterion for deleting unimportant variables from PLS, is practical and effective. The method is suitable for problems involving large amounts of data. With this method, the important variables were picked out and the model was thereby simplified, which makes data analysis and practical application more convenient.
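The idea summarized for chapter 2 (using regression coefficients from a PLS model to drop uninformative variables) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual deletion criterion, which is based on the error function, and the original program was written in VBA. The Python code below therefore substitutes a common stand-in as an assumption: fit a single-response PLS model (PLS1, NIPALS) on autoscaled data and keep the variables with the largest absolute regression coefficients. All function names and the toy data are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def autoscale(columns):
    """Scale each column to zero mean and unit sample standard deviation."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (len(col) - 1))
        sd = sd or 1.0  # guard against constant columns
        scaled.append([(v - mean) / sd for v in col])
    return scaled


def pls1_coefficients(columns, y, n_components):
    """PLS1 regression coefficients via NIPALS.

    `columns` holds autoscaled predictors stored column-wise; `y` is centered.
    """
    X = [col[:] for col in columns]  # working copy, deflated in place
    y = y[:]
    n, p_vars = len(y), len(X)
    R, P = [], []                    # modified weights and X loadings
    b = [0.0] * p_vars
    for _ in range(n_components):
        w = [dot(col, y) for col in X]             # weight vector w = X^T y
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:                           # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(w[j] * X[j][i] for j in range(p_vars)) for i in range(n)]
        tt = dot(t, t)
        p = [dot(col, t) / tt for col in X]        # X loadings
        q = dot(y, t) / tt                         # y loading
        # Express the scores in terms of the undeflated X:
        # r_a = w_a - sum_j (p_j . w_a) r_j
        r = w[:]
        for pj, rj in zip(P, R):
            k = dot(pj, w)
            r = [rv - k * rjv for rv, rjv in zip(r, rj)]
        R.append(r)
        P.append(p)
        b = [bv + q * rv for bv, rv in zip(b, r)]  # accumulate coefficients
        X = [[X[j][i] - t[i] * p[j] for i in range(n)] for j in range(p_vars)]
        y = [yv - q * tv for yv, tv in zip(y, t)]
    return b


def select_variables(columns, y, n_components, keep):
    """Assumed criterion: keep the `keep` variables with the largest |coefficient|."""
    y_mean = sum(y) / len(y)
    b = pls1_coefficients(autoscale(columns), [v - y_mean for v in y], n_components)
    ranked = sorted(range(len(b)), key=lambda j: abs(b[j]), reverse=True)
    return sorted(ranked[:keep]), b


# Toy data (hypothetical): y depends only on variables 0 and 2.
x0 = [-1, -1, -1, -1, 1, 1, 1, 1]
x1 = [-1, -1, 1, 1, -1, -1, 1, 1]
x2 = [-1, 1, -1, 1, -1, 1, -1, 1]
x3 = [a * b for a, b in zip(x1, x2)]
y = [3 * a + 2 * b for a, b in zip(x0, x2)]

selected, coefs = select_variables([x0, x1, x2, x3], y, n_components=2, keep=2)
print(selected)  # [0, 2]
```

On real data, the number of components and the number of retained variables would normally be chosen by cross-validation rather than fixed in advance, which is closer in spirit to an error-based criterion than the simple ranking used here.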
Keywords/Search Tags: Partial Least Squares (PLS), PLS Variable Selection, Variable Dimension Expansion, VBA, Classification Model