| With the rapid growth of China's population and economy, and the rapid exploitation of mineral resources and industrialization, soil health is seriously threatened. However, traditional detection of soil physicochemical properties involves complex procedures such as digestion, separation, concentration, and extraction with strong acids, which must be carried out by experienced professionals. At the same time, the complexity of the measurement process greatly increases the consumption of time and chemical reagents, which raises the measurement cost and makes it impossible to describe the dynamics of soil physicochemical properties over large areas. Therefore, it is of great significance to explore a fast, simple, and environmentally friendly method for detecting soil physicochemical properties. In this paper, 145 soil samples covering 50 parent materials from 17 regions in Guangdong Province were studied; the contents of As, Cd, Pb, cation exchange capacity (CEC), Fe₂O₃, clay, and silt in the soil were analyzed and predicted; and a new visible and near-infrared (Vis-NIR) spectroscopy method was established to detect the physical and chemical properties of soil. The main research contents are as follows: (1) Different spectral pretreatment methods were studied to find the best pretreatment method for each physicochemical property of multi-source soil and to improve the prediction accuracy of the model. First, multiplicative scatter correction (MSC) was applied to the Vis-NIR spectra of the 145 soil samples to reduce the noise caused by the soil surface scattering effect, and Monte Carlo-Partial Least Squares Regression (MC-PLSR), which considers both the concentration and the spectral information of the soil samples, was used to remove the abnormal samples for each physical and chemical property. Then the MSC spectra of the remaining samples were processed separately with first derivative (FD), second derivative (SD), standard normal variate transformation (SNV), and discrete wavelet transform (DWT), followed by Partial Least Squares 
Regression (PLSR) to establish detection models for soil physicochemical properties under each pretreatment method and select the best pretreatment method for each property. The results showed that the choice of pretreatment method had a great influence on the PLSR model results. MSC+FD performed best for Cd and Pb prediction, with Cd showing the most significant improvement (RP² increased by 8.36%, RMSEP decreased by 18.53%). MSC+SNV performed best for As, CEC, and silt, with silt showing the most obvious improvement (RP² increased by 2.34%, RMSEP decreased by 5.69%). MSC+DWT performed best for Fe₂O₃ and clay, with Fe₂O₃ showing the most obvious improvement (RP² increased by 1.27%, RMSEP decreased by 2.33%). (2) Band screening algorithms were studied to select the key bands for each physicochemical property of multi-source soil and to analyze the mechanism of Vis-NIR-based detection of soil physicochemical properties. The key bands of the optimally pretreated spectra were screened by random frog (RF) and competitive adaptive reweighted sampling (CARS), and PLSR models were established for detecting soil physicochemical properties. The soil spectral data were combined with the concentration values of the soil physicochemical properties to analyze the correlation between the spectra and the properties and to screen out the relevant bands for each property. These relevant bands were then compared with the key bands selected by the band screening algorithms to analyze the detection mechanism of soil physicochemical properties based on Vis-NIR. The results show that the key bands screened by CARS and RF overlap strongly with the relevant bands, but that the CARS band screening algorithm is more suitable for detecting multi-source soil physicochemical properties: it greatly improves the prediction performance of the model while reducing the amount of model data. In addition, the key bands of each property 
coincide with the relevant bands: the relevant bands of Fe and CEC are consistent in the range of 420-520 nm; the relevant bands of clay largely coincide with those of As, Pb, and Fe₂O₃; and the relevant bands of silt coincide more closely with those of As and Cd. These results show that soil particle size distribution affects the absorption of As, Pb, and Cd. It is therefore inferred that, when detecting soil physicochemical properties, Vis-NIR detects heavy metal content indirectly through information such as particle size distribution and CEC. (3) The best prediction model for multi-source soil physicochemical properties was established based on the stacked generalization algorithm. To verify the universality and effectiveness of the key bands, support vector regression (SVR) and K-nearest neighbors (KNN) prediction models were established on the key-band data obtained earlier for the multi-source soil physicochemical properties and compared with SVR and KNN prediction models based on the full spectrum. On this basis, the idea of ensemble learning was introduced: the PLSR, SVR, and KNN learners were integrated by stacked generalization (SG), and the results of the three learners were comprehensively analyzed by logistic regression (LR). Finally, a robust learner with high generalization ability was obtained by combining the advantages of the three learners. The results show that the key bands screened by the CARS algorithm are strongly representative and generalize well, which reduces the data volume and model complexity and improves the prediction results of the model. Compared with PLSR, SVR, and KNN, SG improved on the three base learners: it raised the correlation coefficient of the prediction models for the various physicochemical properties of multi-source soil, reduced the prediction deviation, and improved the credibility and generalization ability of the model. The final prediction results for As, Cd, Pb, CEC, Fe₂O₃, clay, and 
silt are RP² = 0.7644 and RMSEP = 2.4480; RP² = 0.8024 and RMSEP = 7.1156; RP² = 0.7504 and RMSEP = 5.6627; RP² = 0.8018 and RMSEP = 1.08790; RP² = 0.9210 and RMSEP = 3.4372; RP² = 0.8206 and RMSEP = 22.5730; and RP² = 0.8028 and RMSEP = 34.1108, respectively. Therefore, the prediction model of multi-source soil physicochemical properties based on the key bands screened by CARS combined with stacked generalization is effective and feasible, and can achieve green, accurate, and rapid detection of multi-source soil physicochemical properties. This study provides a new idea for the practical application of large-scale detection of soil physicochemical properties. |
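
The abstract names MSC and SNV as pretreatment steps and reports R² and RMSEP on the prediction set, but gives no formulas. The following is a minimal sketch of how these are commonly computed, not the study's implementation. It assumes that MSC uses the mean spectrum as its reference, that SNV uses the sample standard deviation, and that R² is the coefficient of determination. All function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum (assumed reference).

    Each spectrum x is fitted as x ~ intercept + slope * ref by least squares,
    then corrected to (x - intercept) / slope.
    """
    n_bands = len(spectra[0])
    ref = [mean(s[j] for s in spectra) for j in range(n_bands)]
    ref_mean = mean(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for x in spectra:
        x_mean = mean(x)
        slope = sum((r - ref_mean) * (v - x_mean) for r, v in zip(ref, x)) / ref_ss
        intercept = x_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in x])
    return corrected


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum on its own."""
    out = []
    for x in spectra:
        m, s = mean(x), stdev(x)
        out.append([(v - m) / s for v in x])
    return out


def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r2_prediction(y_true, y_pred):
    """Coefficient of determination on the prediction set (assumed definition)."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy data (hypothetical): one base curve seen under three scatter conditions,
# each an additive offset plus a multiplicative gain.
base = [0.10, 0.40, 0.30, 0.80, 0.50]
toy_spectra = [[a + b * v for v in base] for a, b in [(0.0, 1.0), (0.2, 1.5), (-0.1, 0.7)]]
toy_msc = msc(toy_spectra)  # rows collapse onto the mean spectrum
toy_snv = snv(toy_spectra)  # each row has zero mean and unit standard deviation
```

In the toy data the three rows differ only by offset and gain, which is the distortion MSC and SNV are meant to remove, so both corrections make the rows coincide. Real spectra would normally be held in an array library rather than nested lists; plain lists are used here only to keep the sketch dependency-free.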