Study On The Variable Selection And Qualitative Analysis Of Spectrum Based On Sum Of Ranking Differences Algorithm

Posted on: 2020-08-24
Degree: Master
Type: Thesis
Country: China
Candidate: M P Nie
GTID: 2381330578959141
Subject: Computer application technology
Abstract/Summary:
Building a chemometric model from the spectral information of the measured sample is a key step in spectral analysis, and the accuracy of the prediction is determined directly by that model. This thesis studies chemometric models for variable selection and for qualitative analysis.

Most variable selection models used in spectral analysis are based on partial least squares (PLS), such as PLS with regression coefficients (PLS-BETA), PLS with uninformative variable elimination (PLS-UVE), and PLS with variable importance in projection (PLS-VIP). In these methods, both the number of latent variables and the threshold applied to the variable importance indicator need to be optimized. Researchers usually choose the number of latent variables from a deviation indicator of the model and pick the threshold subjectively from their own experience. This practice is not objective and increases the risk of over-fitting the calibration model. This thesis proposes combining the sum of ranking differences (SRD) algorithm with multiple model evaluation indicators that characterize model bias or model variance, so that the number of latent variables and the threshold of the variable importance indicator are determined automatically and objectively. VIP and UVE are taken as representative variable selection algorithms, and the experiments use the public corn near-infrared spectroscopy data. The experimental results show that the proposed method outperforms traditional VIP (UVE) in both the interpretability of the selected variables and the prediction accuracy obtained with them. The thesis also studies whether inferior models in the SRD input matrix affect which variable selection model the SRD algorithm selects.

In addition, the thesis proposes combining the SRD algorithm with a classification model for qualitative analysis of spectral data. PLS discriminant analysis (PLS-DA) is taken as the representative classification algorithm, and the experiments use laser-induced breakdown spectroscopy (LIBS) data of mud samples. The experimental results show that the proposed method gives better classification results than PLS-DA alone.

The main contents are as follows:

1. The applications of spectral analysis, methods for tuning parameters in spectral analysis, and the mechanisms of near-infrared spectroscopy and LIBS are introduced. The linear model and variable selection methods are also introduced, with emphasis on the SRD algorithm used in this thesis and on the figures of merit used to characterize model bias or model variance.

2. SRD is used to select the parameter values of a variable selection algorithm, with VIP and UVE as representative algorithms and the public corn near-infrared spectroscopy data as experimental data. SRD, combined with multiple model evaluation indicators that characterize model bias or model variance, selects the best model from the models corresponding to all candidate parameter values. The parameter values of the selected model are taken as the final parameter values of VIP (UVE). This method is named PLS-VIP-SRD (PLS-UVE-SRD). For comparison, the parameter values of VIP (UVE) are also determined in the traditional way, which gives the traditional PLS-VIP (PLS-UVE) model. PLS-VIP-SRD (PLS-UVE-SRD) and PLS-VIP (PLS-UVE) are compared in terms of the interpretability of the selected variables and the prediction accuracy obtained with them.

3. Building on item 2, the thesis studies whether poor models in the SRD input matrix affect the parameter values of VIP (UVE) selected by the SRD algorithm. Poor models are first removed, according to some model indicators, from the models corresponding to all candidate parameter values. SRD, combined with multiple model evaluation indicators that characterize model bias or model variance, then selects the best model from the remaining models. The parameter values of the selected model are the final parameter values of PLS-VIP-SRD (PLS-UVE-SRD) when poor models are excluded from the SRD input matrix.

4. SRD combined with a classification model is proposed for qualitative analysis of spectral data. The models corresponding to all candidate parameter values of the classification model form the rows of the SRD input matrix, and the different classes of samples form its columns, which avoids the parameter tuning required when a single classification model is used. PLS-DA is taken as the representative classification algorithm, and the experiments use LIBS data of mud samples. The proposed method and single PLS-DA are compared in terms of their classification results.
Keywords/Search Tags:Tuning parameter, Sum of ranking differences, Variable selection algorithm, Model bias and model variance, Qualitative analysis
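
The abstract relies on the sum of ranking differences (SRD) throughout but does not spell out the computation. The following is a minimal sketch of the core SRD step as it is commonly described, not the thesis's own implementation. It assumes the usual convention that the candidates being compared are the columns and the objects used to rank them are the rows, and that the reference ranking defaults to the row-wise mean. The abstract describes the candidate models as rows of its input matrix for the classification case, so the thesis layout may be transposed relative to this sketch. The function names and the toy numbers are hypothetical, and the data scaling and randomization-based validation that usually accompany SRD are omitted.

```python
def fractional_ranks(values):
    """Rank values in ascending order (1 = smallest); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def srd_scores(matrix, reference=None):
    """Return one SRD value per column of `matrix`.

    matrix[r][c] is the value of object r (row) for candidate c (column).
    `reference` is a benchmark value per row; if omitted, the row mean is
    used (an assumption of this sketch). A smaller SRD means the column's
    ranking of the rows is closer to the reference ranking.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    if reference is None:
        reference = [sum(row) / n_cols for row in matrix]
    ref_ranks = fractional_ranks(reference)
    scores = []
    for c in range(n_cols):
        col_ranks = fractional_ranks([matrix[r][c] for r in range(n_rows)])
        scores.append(sum(abs(a - b) for a, b in zip(col_ranks, ref_ranks)))
    return scores


# Toy input, made up for illustration only: 4 objects (rows) x 3 candidates (columns).
toy = [
    [1.0, 4.0, 1.0],
    [2.0, 3.0, 2.5],
    [3.0, 2.0, 2.0],
    [4.0, 1.0, 4.0],
]
toy_reference = [1.0, 2.0, 3.0, 4.0]

scores = srd_scores(toy, toy_reference)                 # [0.0, 8.0, 2.0]
best = min(range(len(scores)), key=scores.__getitem__)  # 0
```

In a parameter-tuning setting like the one described above, each candidate would correspond to one parameter value (for example, one combination of latent-variable count and threshold), and the candidate with the smallest SRD would be kept.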