Study On Intelligent Quality Detection And Origin Identification Of Tobacco Leaf Based On Near Infrared Spectroscopy

Posted on: 2020-04-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Wang
Full Text: PDF
GTID: 1363330623462172
Subject: Communication and Information System
Abstract/Summary:
Near-infrared (NIR) spectroscopy is a fast and efficient modern analytical technique at the intersection of spectroscopy, informatics, chemometrics and computer technology. Building on infrared communication systems and the theory of molecular structure, it transmits and receives infrared signals and relates NIR absorption bands to the high-energy molecular groups in organic matter. The NIR spectrum is then analyzed with mathematical and informatics methods and with intelligent information processing technology. With the advantages of simple operation, non-destructive testing and simultaneous multi-component measurement, it can quickly process the optical signal and obtain information about the properties and internal quality parameters of the material being measured.

In engineering practice, the chemical composition and origin of tobacco leaves are the key indicators for evaluating the quality of flue-cured tobacco and for designing formulations. Traditional analytical methods are time-consuming and laborious, and their limitations make it difficult to meet the needs of the intelligent development of the modern tobacco industry. In recent years, the analysis of tobacco leaves by NIR spectroscopy has attracted wide attention from scholars in this field. NIR-based detection of internal leaf quality and identification of origin are realized by building mathematical models on the corresponding data. However, there is still little research on this kind of modeling analysis, and the accuracy, robustness and predictive efficiency of existing models remain far from practical application. The main difficulties are noise such as background interference in the data, the selection of sensitive wavelengths, severe overlap of spectral peaks, complex internal characteristic information, and an extremely time-consuming model training process. Research on modeling methods for tobacco quality detection and origin identification based on NIR spectroscopy therefore lies at the intersection of practical engineering application and scientific research methods, and is significant both for practical application and for academic research.

In this thesis, driven by practical application demand and building on an in-depth study of relevant technologies and theories at home and abroad and on extensive investigation, tobacco leaf quality detection and origin identification methods based on NIR spectroscopy are studied in depth to address the common modeling problems above, and intelligent information processing technology is applied for data mining to improve the accuracy, robustness and efficiency of the mathematical models. The research contents and results of this thesis are summarized as follows.

(1) To address the problem that noise in NIR data reduces the prediction performance of regression models, a robust least angle regression prediction model based on a noise factor (NF-LAR) is proposed for tobacco leaf quality inspection. Working at the level of effective weak-signal acquisition, the model uses the noise factor as a penalty, so that the noise scale at each wavelength constrains the priority of feature selection. In addition, the wavelet transform is used to denoise the data, principal component analysis (PCA) is used to extract features and reduce the dimensionality of the data, and the least absolute shrinkage and selection operator (LASSO) algorithm is used to obtain sparse regression coefficients. Experimental results demonstrate that the NF-LAR model handles noisy data well and improves the accuracy and robustness of the model.

(2) To address the problem that the selection of sensitive wavelengths affects the prediction performance of the model, a support vector machine identification model based on a genetic algorithm (GA-SVM) is established for tobacco origin. Working at the level of effective weak-signal acquisition, the model uses this bio-inspired algorithm to optimize its input variables. In addition, Savitzky-Golay convolution smoothing and PCA are used to preprocess the data, and grid search is used for parameter optimization. Experimental results show that the GA-SVM model can extract the sensitive wavelengths related to tobacco leaf origin and achieves better performance.

(3) To address the problem that severe overlap of spectral peaks and complex internal characteristic information affect the prediction performance of the model, a convolutional neural network classification model whose weights follow a Gaussian distribution of specific variance (GDSV-CNN) is proposed for tobacco origin. Working at the level of spectral characterization, the model applies a deep network to mine non-salient characteristic information in the spectral data and to establish the complex mapping between tobacco origin and spectral characteristics. To avoid vanishing or exploding gradients, the weights of each layer are initialized from a Gaussian distribution of specific variance (GDSV), which accelerates convergence and further improves recognition performance. To avoid overfitting, batch normalization, L2 regularization and dropout are also adopted. The experimental results demonstrate that the GDSV-CNN model has a strong ability to extract complex features and better robustness, and also show that the deep network model is more suitable for information mining and analysis of big data.

(4) To address the time cost of training deep networks on big data and of retraining them when new samples are added, a fuzzy broad learning system classification model based on Takagi-Sugeno fuzzy subsystems (TS-FBLS) is proposed for tobacco origin. Working at the level of prediction efficiency in spectral information analysis, the model uses TS fuzzy subsystems to extract effective features adaptively, and applies pseudo-inverse calculation and incremental learning to speed up model training. The experimental results show that the TS-FBLS model not only achieves higher prediction accuracy but also has a large advantage in training speed, which effectively compensates for the time cost of training and retraining deep networks.
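
The pseudo-inverse step mentioned in part (4) can be illustrated with a small sketch. The abstract does not give the thesis's exact formulation, so the code below assumes the ridge-regularized form commonly used in broad learning systems, W = (AᵀA + λI)⁻¹AᵀY, where A is the matrix of extracted features and Y the targets. The function names, the value of `lam` and the toy data are hypothetical, chosen only for illustration. It uses plain Python lists so that it runs without third-party libraries.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ridge_output_weights(features, targets, lam=1e-8):
    """Output weights W = (A^T A + lam * I)^-1 A^T Y (assumed ridge form)."""
    at = transpose(features)
    gram = matmul(at, features)
    for i in range(len(gram)):
        gram[i][i] += lam  # the ridge term keeps the system well conditioned
    return solve(gram, matmul(at, targets))


# Hypothetical toy data: three samples, two features, one output column.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y = [[2.0], [-1.0], [1.0]]
W = ridge_output_weights(A, Y)
print(W)  # approximately [[2.0], [-1.0]]
```

Because the weights come from one linear solve instead of iterative gradient descent, training cost is dominated by a single matrix factorization. This is the general reason such closed-form models train quickly. The incremental update for newly added samples is not shown here.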
Keywords/Search Tags: NIR spectroscopy, Intelligent information processing, Data mining, Deep network, Broad learning system