| Detecting the heavy metal content of soil is of great significance for soil quality assessment and pollution control. Laser-induced breakdown spectroscopy (LIBS) provides a fast and straightforward analysis method that can measure multiple elements simultaneously, so quantitative analysis of heavy metal elements in soil based on LIBS is of great value. However, the accuracy of LIBS is limited by the matrix effect, that is, the variation in the chemical and physical properties of the many elements found in soils. Moreover, when multivariate regression models are used for quantitative analysis, the number of variables exceeds the number of samples, which causes the models to overfit. A variable selection method is needed to address this issue. This study examines the influence of different pretreatment methods on the quantitative analysis of Cr, Cu, Ni, and Pb, and establishes a hybrid model, based on variable selection and multivariate regression, for quantifying heavy metal concentrations in fourteen soil samples using LIBS. The main research contents are as follows:
(1) A LIBS experimental setup was built based on the principle of LIBS and the spectral characteristics of soil samples. The setup consisted of a laser source, an optical system, a multichannel spectrometer, a delay generator, and a sample stage, arranged to optimize the observation of emission intensity. The experimental parameters, such as delay time and laser pulse energy, were also optimized.
(2) The LIBS spectra were preprocessed using spline interpolation for baseline correction, min-max normalization, and wavelet threshold denoising. A random forest (RF) model confirmed that combining the three preprocessing methods enhances the predictive performance of the multivariate analysis. Moreover, the RF model outperformed both the support vector machine (SVM) and Bernoulli naïve Bayes (BNB) models. The coefficient of determination (R²) values for Cr, Cu, Ni, and Pb improved to as high as 0.91, and the root mean square error of calibration with cross-validation (RMSECV), root mean square error of prediction (RMSEP), relative standard deviation (RSD), and limit of detection (LoD) all decreased.
(3) LIBS spectra contain a large number of variables, and feeding all of them directly into multivariate analysis algorithms burdens those algorithms. To address this, we proposed sequential floating forward selection (SFFS) and Fibonacci recursive feature elimination (FRFE) as variable selection methods, and the wavelet neural network (WNN) and generalized regression neural network (GRNN) as multivariate analysis methods. The SFFS-WNN model performed well compared with the standard RF model: the R² values were between 0.9674 and 0.9821, and the RMSECV, RMSEP, RSD, and LoD values were not greater than 8.44 mg/kg, 10.91 mg/kg, 9.67%, and 15.14 mg/kg, respectively. FRFE-GRNN outperformed all the other model combinations, with R² values between 0.9897 and 0.9989 and RMSECV, RMSEP, RSD, and LoD values not greater than 3.84 mg/kg, 4.03 mg/kg, 3.01%, and 5.29 mg/kg, respectively. We also applied FRFE-GRNN to soil classification to show the superiority of the model, obtaining the highest sensitivity, specificity, and accuracy of 99.07%, 99.92%, and 99.05%, respectively.
This study confirms that quantitative analysis of Cr, Cu, Ni, and Pb concentrations in soil based on LIBS combined with machine learning, especially SFFS-WNN and FRFE-GRNN, is feasible and effective in reducing redundant variables, which lays the technical foundation for real-time monitoring of heavy metals with subsequent LIBS instruments. |
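
The GRNN regression step and the error metrics reported above can be illustrated with a minimal sketch. It assumes the textbook GRNN formulation (a Gaussian-kernel-weighted average of the training targets) and is not the implementation used in this work. The function names, the smoothing parameter `sigma`, and the toy intensities and concentrations are all hypothetical.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.5):
    """Predict a target for `query` as a Gaussian-weighted mean of train_y.

    train_x: list of feature vectors (e.g. selected, normalized line intensities)
    train_y: list of known concentrations for those vectors
    sigma:   smoothing parameter (hypothetical default, would need tuning)
    """
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed to zero: fall back to the plain mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy data, invented for illustration only (not from the study).
    spectra = [[0.1, 0.2], [0.4, 0.5], [0.8, 0.9]]
    conc_mg_kg = [10.0, 40.0, 80.0]
    preds = [grnn_predict(spectra, conc_mg_kg, s, sigma=0.2) for s in spectra]
    print("RMSE:", rmse(conc_mg_kg, preds))
    print("R^2:", r_squared(conc_mg_kg, preds))
```

A small `sigma` makes the prediction follow the nearest training spectrum closely, while a large `sigma` pulls every prediction toward the mean concentration. For RMSECV, the same `rmse` function would be applied to predictions made on held-out folds, and for RMSEP to predictions on an independent test set.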