With the support of chemometric methods, near-infrared spectroscopy has been widely used in the food, pharmaceutical, chemical, textile and other industries. The traditional two-stage calibration approach has several problems: the preprocessing algorithm is hard to select, model building is complex, prediction accuracy is low, and overfitting is likely. At present, the theory of near-infrared spectroscopy modeling based on neural networks is still immature. Existing analysis models based on neural networks still suffer from overfitting, poor generalization, poor interpretability, difficult hyperparameter selection, inferior model performance, and neglect of the correlation between the substances to be measured. To address these problems, this paper studies near-infrared spectroscopy modeling methods based on convolutional neural networks. The main research contents are as follows.

The first part introduces the principle and applications of near-infrared spectroscopy and the modeling method based on convolutional neural networks, and selects the corresponding evaluation indexes based on an analysis of the shortcomings of traditional near-infrared modeling methods.

The second part proposes a new near-infrared spectroscopy modeling method, ATSpec Net, a single-component prediction model based on a convolutional neural network. In this method, convolutional layers obtain information from the full spectrum, convolution kernels of different sizes extract feature information at multiple scales, and activation functions introduce nonlinearity into the model. The features are then summarized and reduced in dimension by a one-dimensional pooling layer, and the extracted features are mapped to the chemical values of the samples to be measured by a multi-layer fully connected neural
network. An evaluation function computes the error between the predicted value and the true value. This error is propagated back through the model to update the network parameters. The process is repeated until the prediction error converges to a small, stable value, which yields the final prediction model. Prediction models are built and tested on public near-infrared data sets for corn, diesel, beer and milk. The results show that, compared with partial least squares (PLS), support vector regression (SVR) and a back-propagation (BP) neural network, the root mean square error of prediction (RMSEP) for corn oil content decreases from 0.060, 0.097 and 0.292 to 0.046, and the modeling accuracy increases by 23.3%, 52.6% and 84.2%, respectively. The RMSEP for the cetane number of diesel decreases from 2.138, 2.266 and 2.387 to 1.964, and the modeling accuracy increases by 8.1%, 13.3% and 17.7%, respectively. The RMSEP for beer yeast content decreases from 0.581, 0.606 and 3.345 to 0.539, and the modeling accuracy increases by 7.2%, 11.1% and 83.9%, respectively. For milk protein content, the RMSEP decreases from 0.567, 0.582 and 0.671 to 0.216, and the modeling accuracy increases by 63.5%, 64.4% and 67.8%, respectively.

The third part proposes a second near-infrared spectroscopy modeling method, ATMSpec NET, which is based on a convolutional neural network and is suited to multi-component prediction. First, a preprocessing module removes the interference caused by noise and invalid information in the spectrum, and multiple expert networks then perform a preliminary extraction of features from different spaces of the spectrum. Next, a weighting network assigns weights to the features extracted by each expert network. Finally, the weighted feature vectors enter different branch networks, and multiple chemical values of the samples to be tested are obtained through the
secondary extraction and mapping of the features by the branch networks. The model is built and tested on the public corn data set. The results show that, compared with the PLS, SVR and ATSpec Net models in multi-component prediction, the RMSEP for corn moisture content decreases from 0.272, 0.275 and 0.230 to 0.211, and the modeling accuracy increases by 22.4%, 23.2% and 8.2%, respectively. The RMSEP for corn oil content decreases from 0.060, 0.097 and 0.046 to 0.043, and the modeling accuracy increases by 28.3%, 55.7% and 6.5%, respectively. The RMSEP for corn protein content decreases from 0.332, 0.342 and 0.307 to 0.239, and the modeling accuracy increases by 28.0%, 30.1% and 22.1%, respectively. For corn starch content, the RMSEP decreases from 0.586, 0.523 and 0.522 to 0.428, and the modeling accuracy increases by 27.0%, 18.1% and 18.0%, respectively.

In conclusion, compared with classical modeling methods, the two models proposed in this paper, one for single-component prediction and one for multi-component prediction, perform better on the objective indexes. This indicates that the proposed modeling approach effectively improves prediction accuracy.
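
The accuracy gains quoted above appear to be relative reductions in RMSEP with respect to each baseline model. The following is a minimal Python sketch of that reading, not the authors' code: the function names `rmsep` and `relative_improvement` are hypothetical, and the interpretation of "modeling accuracy increase" as (baseline RMSEP − new RMSEP) / baseline RMSEP is an assumption checked here only against the corn oil figures reported in the text.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a test set."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(baseline_rmsep, new_rmsep):
    """Percentage reduction in RMSEP relative to a baseline (assumed definition)."""
    return 100.0 * (baseline_rmsep - new_rmsep) / baseline_rmsep


# RMSEP values for corn oil content as quoted in the text (single-component model).
baselines = {"PLS": 0.060, "SVR": 0.097, "BP": 0.292}
proposed = 0.046

for name, value in baselines.items():
    print(f"{name}: {relative_improvement(value, proposed):.1f}%")
```

Under this assumed definition, the three printed percentages can be compared directly with the 23.3%, 52.6% and 84.2% reported for corn oil content.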