A Study On Determination Of The Geographical Origins And Internal Qualities Of Oranges Based On NIR Spectroscopy Analysis

Posted on: 2018-01-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S J Dan
GTID: 1361330563450980
Subject: Communication and Information System
Abstract/Summary:
As a fast, accurate, convenient, and non-destructive analysis technique, near-infrared (NIR) spectroscopy has been widely used in agricultural product quality detection and origin identification, and it is considered a possible non-destructive replacement for traditional chemical analysis. At present, however, NIR-based quality identification and origin detection for oranges remains time-consuming and imprecise, and a large gap still separates what is expected of NIR analysis from its practical application in terms of completeness, systematization, and interoperability. Establishing an effective technical system that identifies the quality and origin of oranges quickly and reliably is therefore important for the country's healthy development. To address these problems, and in line with the requirements of the national Spark Program project on an orange industry information service system, this thesis focuses on techniques for origin identification and internal quality analysis of fruit, and proposes a variety of rapid, non-destructive models for orange origin identification and quality inspection based on spectral analysis with machine learning. The models were verified on spectral samples of oranges collected from 16 regions in 6 provinces. The experimental results indicate that the proposed models improve the effectiveness and accuracy of origin identification and internal quality detection. The main research points are as follows:

(1) A general origin identification framework is proposed, consisting of spectral data preprocessing, feature extraction, feature selection, classification model building, and performance evaluation. A series of experiments was conducted within this framework: Savitzky-Golay (SG) smoothing combined with first- and second-order derivatives is applied to preprocess the data, and information gain (IG) is used to select the most representative features from those extracted by PCA. For model construction, several classifiers, including decision tree, k-nearest neighbor, naïve Bayes, and linear discriminant analysis (LDA), are investigated for classifying oranges collected from 16 districts. The experimental results show that SG smoothing generally improves the classification performance of most classifiers, and that feature selection has a positive effect on the classification results. The LDA model is the most stable and achieves the best prediction result among the tested models, 92.8%.

(2) Classification models of geographical origin based on the support vector machine (SVM) and the genetic algorithm support vector machine (GA-SVM) are proposed. First, different SVM kernel types are investigated, and the RBF kernel is found to perform best. Second, the SVM parameter settings are analyzed in detail, and the best parameters are obtained by grid search, giving a best prediction result of 93.52%. To further improve SVM performance, a genetic algorithm is applied to find the best feature subset. The GA parameters, including population size, crossover rate, and mutation rate, are also studied in detail to achieve the best fitness value. The experimental results indicate that GA-SVM significantly improves the prediction rate of SVM.

(3) A classification model of geographical origin based on L1-norm linear regression classification (L1-LRC) is proposed. L1-LRC classifies by minimum reconstruction error using L1-norm regularized learning, which integrates feature selection and classifier learning; it also reveals the structural characteristics of the spectral information more effectively. The experimental results show that L1-LRC achieves relatively high precision and performs much better than the other tested models when only a few training samples are available. This work thus suggests a new approach to fast and efficient NIR spectral classification of geographical origins.

(4) A model for determining the internal qualities of oranges based on least angle regression (LAR) is proposed, and the quality parameters total soluble solids (TSS), titratable acidity (TA), and vitamin C (VC) content are investigated. The LAR model is compared with existing nonlinear and linear models, namely LS-SVM and PLS. In prediction, LAR performs better than conventional PLS. In computational complexity, LAR and PLS are better than LS-SVM. In interpretability, LAR is superior to PLS. Although LAR is less precise than LS-SVM, it has advantages over LS-SVM in ease of implementation, computational complexity, and interpretability. The proposed LAR model can therefore be applied effectively to the determination of internal qualities of oranges based on NIR spectroscopy.
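
As an illustration of two steps named in the first research point, the sketch below applies Savitzky-Golay smoothing and a first derivative to a spectrum and scores a single feature by information gain. It is not the thesis implementation. The 5-point window, the quadratic fit, unit sample spacing, the single-threshold discretization used for information gain, and all function names are assumptions made for this example; the abstract does not give these details.

```python
import math
from collections import Counter

# Classic 5-point Savitzky-Golay kernels for a quadratic fit, unit spacing.
# Window length and polynomial order are assumptions, not values from the thesis.
SG_SMOOTH_5 = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
SG_DERIV1_5 = [-2 / 10, -1 / 10, 0.0, 1 / 10, 2 / 10]


def sg_filter(spectrum, coeffs):
    """Apply a Savitzky-Golay kernel to the interior points of a spectrum.

    Edge points without a full window are dropped, so the result has
    len(spectrum) - len(coeffs) + 1 values.
    """
    half = len(coeffs) // 2
    return [
        sum(c * spectrum[i + k - half] for k, c in enumerate(coeffs))
        for i in range(half, len(spectrum) - half)
    ]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Information gain from splitting one feature at a threshold.

    Splitting at a single threshold is an assumed discretization for
    continuous features such as PCA scores.
    """
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted
```

In a pipeline of this kind, `sg_filter` would be run on each raw spectrum before feature extraction, and `information_gain` would be evaluated for each extracted feature so that the highest-scoring features are passed to the classifier.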
Keywords/Search Tags:NIR spectroscopy analysis, Machine learning, Determination of the geographical origins of oranges, Determination of internal qualities of oranges