Research On Method And Application Of Wavelength Selection Based On Minimum Correlation Coefficient

Posted on: 2022-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: J H Cheng
Full Text: PDF
GTID: 2491306320971639
Subject: Agricultural Electrification and Automation

Abstract/Summary:
Near-infrared spectroscopy (NIRS) quantitative analysis is an efficient measurement and analysis technology that is widely used in agriculture. However, near-infrared spectral data are high-dimensional, typically with several hundred to several thousand variables. If a model is built directly on the full spectrum, collinearity among the variables seriously degrades model accuracy. Characteristic wavelength selection has therefore become a research hotspot in near-infrared spectroscopy in recent years.

This study analyzes the advantages and disadvantages of three commonly used wavelength selection algorithms: the successive projections algorithm (SPA), the correlation coefficient method (CC), and competitive adaptive reweighted sampling (CARS). SPA can eliminate collinearity between variables, but it considers only the projections between vectors; variables with large projection distances are not necessarily valid variables and may be uninformative or interfering variables. CC considers the correlation between each wavelength and the physical and chemical property being measured, but it ignores both the collinearity within the spectral data and the presence of uninformative and interfering variables. CARS has poor stability: samples are drawn randomly by Monte Carlo (MC) sampling during operation, so the selected characteristic wavelengths differ from run to run.

In view of these advantages and disadvantages, this study proposes a near-infrared wavelength selection method that eliminates collinearity among variables, called the minimal correlation coefficient (MCC) method. MCC selects wavelength points based on the minimum correlation coefficient between variables. Starting from the correlation coefficient matrix, the wavelength points with the smaller mean and standard deviation of correlation coefficients are selected as the candidate modeling wavelength set, so that the linear correlation among the wavelengths in the set is smallest and collinearity between the model variables is eliminated to the greatest extent. At the same time, to maximize the influence of the input variables on the response variable, the wavelengths with the greater influence on the dependent variable are selected using standardized regression coefficients. A linear regression model is then built by forward selection to obtain the optimal prediction model.

To verify the effectiveness of the method, MCC wavelength selection was used to build multiple linear regression (MLR) models on two NIRS data sets (a soil data set and a diesel data set), and the results were compared with a full-spectrum model (FULL-PLSR) and with models based on the three commonly used wavelength selection algorithms (SPA-MLR, CARS-PLSR, CC-MLR). The results show that MCC has good predictive performance on both data sets. On the soil data set, the MCC-MLR model has an Rp² of 0.9265 and an RMSEP of 1.0323; CARS performs best among the other three wavelength selection algorithms, and the CARS-PLSR model has an Rp² of 0.9088 and an RMSEP of 1.3682. On the diesel data set, the MCC-MLR model has an Rp² of 0.9560 and an RMSEP of 2.7792; SPA performs best among the other three wavelength selection algorithms, and the SPA-MLR model has an Rp² of 0.9539 and an RMSEP of 2.8449. These results indicate that MCC achieves efficient dimensionality reduction and improves the prediction performance of the model, making it a feasible wavelength selection algorithm.

A software system for near-infrared spectroscopy modeling based on MCC wavelength selection was developed in MATLAB. It supports data import, spectral data preprocessing, sample set division, wavelength selection, and the building of multiple linear regression models, and it provides users with a concise, fast human-computer interaction window for rapid and accurate determination of the physical and chemical properties of samples.
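
The candidate-selection step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation (which was written in MATLAB and is not reproduced here). The function names `pearson`, `mcc_candidates`, and `rmsep` are hypothetical. The scoring rule (mean plus standard deviation of the absolute correlation coefficients) is an assumption, since the abstract does not say how the two statistics are combined. The later steps (ranking by standardized regression coefficients and forward selection) are omitted.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mcc_candidates(spectra, n_keep):
    """Return sorted indices of the n_keep wavelengths least correlated with the rest.

    spectra: list of samples, each a list of absorbances (one per wavelength).
    Score per wavelength = mean + population std of |r| against every other
    wavelength. Combining the two statistics by summing is an ASSUMPTION.
    """
    columns = list(zip(*spectra))  # one tuple of values per wavelength
    p = len(columns)
    scored = []
    for i in range(p):
        r = [abs(pearson(columns[i], columns[j])) for j in range(p) if j != i]
        scored.append((mean(r) + pstdev(r), i))
    lowest = sorted(scored)[:n_keep]  # smallest scores first
    return sorted(i for _, i in lowest)


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


if __name__ == "__main__":
    # Toy data: 5 samples x 4 wavelengths; columns 0-2 are nearly collinear.
    spectra = [
        [1, 2, 1.1, 3],
        [2, 4, 2.0, 1],
        [3, 6, 3.2, 4],
        [4, 8, 3.9, 1],
        [5, 10, 5.1, 5],
    ]
    print(mcc_candidates(spectra, 1))
```

On this toy data the sketch keeps wavelength index 3, the only column that is not nearly collinear with the others. A low correlation with other wavelengths does not by itself make a variable informative, which is why the abstract follows this step with a filter based on standardized regression coefficients.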
Keywords/Search Tags: Wavelength selection, Near-infrared spectroscopy, Multivariate calibration, Minimal correlation coefficient