Application Of Pattern Recognition Techniques To Analysis Of Infrared Spectroscopy Of Some Natural Products

Posted on: 2010-07-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Zhang
GTID: 1103360272496177
Subject: Agricultural mechanization project
Abstract/Summary:
In recent years, infrared spectroscopy has made breakthrough progress in the quality analysis of some natural products, especially in traditional Chinese medicine, owing to its many advantages: rapid analysis, no pollution, no need for special pretreatment, no toxic or harmful reagents, non-destructive measurement, simple operation and low cost. The near-infrared region mainly shows overtone absorptions of the stretching vibrations of O-H, N-H and C-H bonds, which makes it especially suitable for quantitative analysis of functional groups in natural products. The fundamental vibration frequencies of the overwhelming majority of organic compounds, by contrast, appear in the mid-infrared region, which is more suitable for qualitative analysis of the functional groups and structure of natural products.

Multiple linear regression (MLR), principal component analysis (PCA) and partial least squares regression (PLSR) are the traditional chemometric methods in infrared spectral analysis. However, many reports indicate that the relationship between the target property and the spectral data is often non-linear, so these linear techniques cannot achieve very good prediction accuracy. Pattern recognition technology, in contrast, can use a computer to identify which category a specimen belongs to, and it generalizes well. It can therefore be used to select and extract spectral features, to classify and predict objects, and, through self-learning and regression, to quantify and predict specific components.

However, traditional pattern recognition methods apply a single set of pattern features to all sample classes and do not differentiate among them.
Each feature is fed to the matching module, which matches and classifies it directly. Therefore, when a pattern does not match its label, it is very difficult to locate the part at fault, and revising the algorithm and adjusting its parameters is to some degree blind. Human experience cannot be brought to bear, and the recognition rate can be improved only by learning from large numbers of samples and repeatedly adjusting parameters. For infrared spectra, the complexity of the data and the heavy overlap of spectral peaks make analysis difficult, so pattern recognition is clearly hard. If only traditional pattern recognition methods are used, effective classification is difficult to achieve. Fortunately, after L. A. Zadeh proposed fuzzy sets, fuzzy mathematics was introduced into pattern recognition (i.e. fuzzy pattern recognition). A recognition system designed with fuzzy techniques can simulate human thinking more broadly and deeply, which improves the intelligence, usability and reliability of the system.

The artificial neural network (ANN) has also been used with relative success in spectral analysis because it can approximate nonlinear functions arbitrarily well. But ANN suffers from critical drawbacks. It easily falls into over-fitting. It also relies heavily on the training samples, and in most situations the sample data are extremely limited (the so-called small-sample problem), so the predictive ability of the ANN model is weak. In addition, because the spectral data of samples are usually high-dimensional, features must be extracted from the raw spectra with dimensionality reduction to reduce the amount of computation.
Otherwise the training time of the ANN model increases greatly, convergence becomes very slow, and the model may not converge at all.

The support vector machine (SVM), a more recent pattern recognition method, has a sound theoretical foundation in statistical learning theory. It has been widely applied in pattern recognition, time-series analysis, function approximation and other fields. Unlike traditional statistical theory, SVM targets small samples: it seeks the optimal solution from the limited sample information available rather than assuming that the number of samples tends to infinity. Moreover, SVM models can avoid over-fitting and have superior generalization ability and prediction accuracy.

The focus of this work is as follows. For some natural products, infrared spectroscopy is combined with several pattern recognition techniques, namely partial least squares, fuzzy pattern recognition, artificial neural networks, support vector machines and grey correlation analysis, to carry out qualitative and quantitative analysis. The aim is to find a more effective modeling method for infrared spectroscopy and to provide new ideas and techniques for the infrared spectroscopic analysis of natural products.

Taking ginseng, Epimedium Brevicornum and tobacco as the objects of study, fuzzy pattern recognition is proposed for the first time for the qualitative analysis of infrared spectra. The key issues in the analysis were also resolved, namely dimension reduction of the spectral variables, the closeness degree, the nearest-choice principle and the analysis steps.
The simulation results indicated that the habitat discrimination models can, for the most part, correctly classify 42 Epimedium Brevicornum samples, 40 ginseng samples and 120 tobacco samples, which is satisfactory. Moreover, the approach avoids the separation and extraction of natural products required by traditional spectroscopic analysis, and thus offers an effective and reliable basis for the quality control and modernized management of natural products.

For tobacco and concocted Coptis, the partial least squares method was studied for near-infrared quantitative analysis of a single component and of multiple components simultaneously, and the pretreatment scheme for the spectral data and the evaluation indices for generalization were determined. The simulation experiments indicated that partial least squares can meet practical needs to a certain extent in the spectral analysis of natural products, but its optimization time is excessively long, it is not suited to small samples, and its generalization ability is relatively weak, which reduces its practical value.

Taking ginseng, Epimedium Brevicornum, tobacco and concocted Coptis as examples, artificial neural networks were applied to habitat discrimination analysis of mid-infrared spectra and qualitative analysis of near-infrared spectra; the key parameters were determined, related problems were solved, and the models were evaluated. The simulation results indicated that the habitat discrimination models achieve accuracy rates above 92% for both near-infrared and mid-infrared spectroscopy. The quantitative analysis models also predicted quite accurately, and every evaluation index of the models was good.
At the same time, it was found that artificial neural networks applied to infrared spectroscopic analysis have limited generalization ability and easily fall into local optima. Moreover, when relatively few samples are used in modeling, the predictive ability of the models weakens noticeably.

The wavelet transform combined with the support vector machine is proposed for the first time for the qualitative and quantitative analysis of mid-infrared and near-infrared spectra. The kernel parameters and the method of choosing the kernel function were discussed and analyzed, and the approach was tested by simulation on single-component and multi-component quantitative analysis of near-infrared spectra of tobacco and concocted Coptis samples, and on habitat discrimination analysis of infrared spectra of ginseng and Epimedium Brevicornum samples. Finally, the models built with the different pattern recognition methods were carefully compared. The comparison indicated that the qualitative and quantitative analysis models based on support vector machines, for both near-infrared and mid-infrared spectroscopy, show several advantages: good reliability and robustness, the best discrimination accuracy, the highest prediction precision, the strongest generalization, the shortest modeling time, fewer manually controlled factors, the best suitability for small samples, and resistance to falling into local optima. The support vector machine therefore has high practical value and broad application prospects in infrared spectroscopic analysis.

The grey correlation analysis method is used for the first time to select optimal spectral regions in near-infrared spectroscopy.
The peak area of each candidate spectral region and its correlation degree with the specific component are calculated, and the regions with the maximum correlation degree are taken as the optimal spectral regions and used to establish the models. The simulation results showed that the modeling time was greatly reduced and the prediction precision was significantly increased. This approach therefore has high application value.
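
The region-selection step described above can be illustrated with a minimal sketch. It assumes the commonly used Deng formulation of the grey relational grade with mean-value normalization and a distinguishing coefficient of 0.5; the abstract does not state which variant the dissertation uses. The function names, the region labels and the numbers are hypothetical and serve only as illustration.

```python
def normalize(seq):
    """Scale a sequence by its mean so sequences in different units are comparable."""
    mean = sum(seq) / len(seq)
    return [v / mean for v in seq]


def grey_relational_grades(reference, candidates, rho=0.5):
    """Return the grey relational grade of each candidate sequence.

    reference  -- component content measured for each sample
    candidates -- dict mapping a region name to its peak area for each sample
    rho        -- distinguishing coefficient (0.5 is a common default, assumed here)
    """
    ref = normalize(reference)
    # Absolute differences between the reference and every candidate sequence.
    deltas = {
        name: [abs(r - c) for r, c in zip(ref, normalize(seq))]
        for name, seq in candidates.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every candidate coincides with the reference after normalization.
        return {name: 1.0 for name in deltas}
    grades = {}
    for name, ds in deltas.items():
        # Grey relational coefficient per sample, averaged into a grade.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


# Hypothetical data: four samples, two candidate spectral regions.
reference = [1.0, 2.0, 3.0, 4.0]
peak_areas = {
    "region_a": [2.0, 4.0, 6.0, 8.0],  # proportional to the reference
    "region_b": [5.0, 1.0, 4.0, 2.0],  # unrelated to the reference
}
grades = grey_relational_grades(reference, peak_areas)
best = max(grades, key=grades.get)
print(best, grades)
```

In this toy data the peak areas of region_a are proportional to the component content, so that region receives the highest grade and would be the one kept for modeling.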
Keywords/Search Tags: pattern recognition, ginseng, Epimedium Brevicornum, concocted Coptis, tobacco, infrared spectroscopy