Analysis And Application Of The Influence Of Characteristic Wavelength Selection On Near-infrared Spectroscopy Modeling

Posted on: 2024-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: X J Liang
Full Text: PDF
GTID: 2530306917970509
Subject: Software engineering
Abstract/Summary:
Near-infrared spectroscopy (NIRS) is a fast, non-destructive, efficient, and reliable analysis technique that is widely used in food, medicine, the chemical industry, and agriculture, and especially in Traditional Chinese Medicine (TCM). NIR spectral analysis can identify the origin and grade of Chinese medicinal materials and measure the content of their medicinal ingredients. However, NIR data are typical high-dimensional, small-sample data: the number of feature dimensions is much larger than the number of samples. The spectra inevitably contain redundant information and interfering feature wavelengths, which increase model complexity and computational cost and degrade prediction performance. Selecting effective feature wavelengths is therefore crucial in NIR modeling. To address the redundant information and interfering wavelengths in NIR spectra, this thesis explores the influence of different feature wavelength selection algorithms on NIR modeling. The main contents are as follows:
(1) A data collection experiment was designed and a data collection platform was built to acquire spectral data of Crataegi Folium and Polygoni Multiflori Radix using a SupNIR-2700 near-infrared spectrometer.
(2) To address the small sample size, high dimensionality, and class imbalance of the data used for NIR origin discrimination models, a qualitative analysis method was studied that combines sparse principal component analysis for feature selection (SPCAFS) with support vector machine (SVM) modeling. The method builds a sparse representation of the data, uses the sparsity of the feature vectors to eliminate redundant information, and uses SVM to establish an NIR origin discrimination model that separates Crataegi Folium of northern and southern origin. Unlike traditional principal component analysis, SPCAFS enhances the sparsity of the feature vectors through L1 regularization and introduces constraints that enhance the independence between feature vectors, thereby improving the discrimination between samples. The model is evaluated by accuracy, precision, and sensitivity and is compared with three feature selection algorithms: the Successive Projections Algorithm (SPA), Regularized Self-Representation (RSR), and Sparse Subspace Clustering (SSC). The comparison verifies the effectiveness of SPCAFS in discriminating the northern and southern origins of Crataegi Folium.
(3) To address the small sample size, high dimensionality, and locally linear but globally nonlinear character of the data used for NIR content detection models, a quantitative analysis method was studied that combines scalable one-pass self-representation learning (SOP-SRL) with decision tree-partial least squares regression (DT-PLS) modeling. The method assigns an appropriate weight to the loss of each sample to emphasize the differences between samples, and adds a graph-based regularization term that improves the representativeness of the selected feature wavelengths by measuring the local similarity between samples constructed from those wavelengths. The DT-PLS modeling algorithm is used to handle the local linearity and overall nonlinearity of the Polygoni Multiflori Radix (PMR) dataset. With stilbene glycosides and anthraquinones in PMR as the research objects, the appropriate pretreatment, sample selection, and feature selection methods were chosen through comprehensive comparative analysis, and a content detection model was established for each of the two constituents. The results showed that SOP-SRL-DT-PLS performed best among the methods compared.
(4) To verify the practical applicability of SPCAFS and SOP-SRL in TCM detection, PLS quantitative analysis models based on SPCAFS and on SOP-SRL were designed. With ferulic acid, baicalin, and wogonoside in Antai pills as the research objects, representative feature wavelengths were selected by SPCAFS and SOP-SRL respectively, and PLS was then used to establish the NIR regression models. Compared with the modeling results of three other feature wavelength selection algorithms (CCBS, RSR, and SSC), the results show that SPCAFS and SOP-SRL can effectively extract the feature wavelengths of ferulic acid, baicalin, and wogonoside in Antai pills and improve the accuracy of the regression models.
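
The abstract evaluates the origin discrimination model by accuracy, precision, and sensitivity, and notes that the data are class-imbalanced. As a minimal sketch of why all three metrics are reported, the following Python snippet computes them from true and predicted labels. The function name `binary_metrics`, the label encoding, and the example labels are assumptions made for illustration; they are not taken from the thesis and do not reproduce its data or results.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    sensitivity: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and sensitivity (recall) for one positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Report 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return BinaryMetrics(accuracy, precision, sensitivity)


# Hypothetical, imbalanced labels (illustration only):
# 0 = majority origin class, 1 = minority origin class.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
m = binary_metrics(y_true, y_pred)
print(m)
```

On these made-up labels the accuracy is high because the majority class dominates, while precision and sensitivity for the minority class are noticeably lower, which is the kind of gap that accuracy alone would hide on imbalanced data.
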
Keywords/Search Tags: Near infrared spectroscopy, Traditional Chinese Medicine (TCM), Feature wavelength selection, Sparse principal component analysis for feature selection, Scalable one-pass self-representation learning