Application Of Chemometrics In Complex System Of Traditional Chinese Medicine Analysis

Posted on: 2011-05-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y H Wei
GTID: 1101360305465956
Subject: Chemical informatics
Abstract/Summary:
Synthetic drugs developed rapidly during the 20th century. However, the success rate of screening synthetic compounds for potential drugs has fallen while the cost has risen, and researchers have turned back to natural products in the search for new drugs. Traditional Chinese medicine (TCM) therefore faces both opportunities and challenges in the 21st century, and advances in science and technology are expected to open a new period of development for TCM research. China has abundant herbal resources and a long history of TCM use. As TCM is studied in greater depth, a series of complex questions has emerged: What are the main effective substances in TCM? By what mechanism does TCM cure disease? How can the quality of a TCM containing hundreds of compounds be controlled efficiently? These questions have become bottlenecks that restrict the modernization of TCM and are an important task for researchers. TCM is a complex organic system, and its efficacy should be regarded as the contribution of all the chemical compounds it contains. Research on TCM must therefore take a view of the "whole". If the "whole" is ignored, the study deviates from the theoretical system of TCM, and the results may be limited and one-sided to some extent. In recent years, some researchers have proposed the views of "Great Quality" and "quality with efficacy", which are consistent with the theoretical system of TCM and should have an important effect on TCM research. Advanced instruments and data mining methods play an important role in solving such complex problems. Combining chromatographic and spectroscopic techniques with chemometric methods provides a feasible strategy for the problems of the complex TCM system.
In this dissertation, modern chromatography, spectroscopy and chemometrics were combined to address problems in the origin identification, quality control, effective substances and compound identification of TCM.
In chapter 1, the principles of several chemometric algorithms were introduced, including support vector machines (SVM), least squares support vector machine (LS-SVM), multiple linear regression (MLR), genetic algorithm (GA), random forest (RF) and the k nearest neighbor algorithm (kNN), and their applications in the study of TCM were reviewed.
In chapter 2, near infrared (NIR) spectra of different Angelica sinensis samples were obtained, and the original spectral data were pretreated by standardization and first derivative. RF and kNN were used to build classification models of Angelica sinensis. The classification accuracy of the RF model was 92.21% for the training set and 94.74% for the test set. The corresponding values for the kNN model were 94.81% and 94.74%, and the 3-fold cross-validation accuracy of the kNN model was 94.81%. In addition, nine variables (wavenumbers) were chosen by RF, and the selected variables were highly relevant to the compounds contained in the samples. Moreover, the proposed RF and kNN models were applied to predict 30 Angelica samples from different origins, with classification accuracies of 80.00% and 86.67%, respectively. These results indicate that NIR combined with RF or kNN can discriminate the origin of Angelica, and that kNN had better application ability than RF.
The contents of ferulic acid and ethanol extract of the Angelica sinensis samples were determined by the methods described in the Chinese Pharmacopoeia.
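As an illustration of the chapter 2 classification workflow (derivative pretreatment, standardization, then kNN), a minimal Python sketch follows. It is not the dissertation's code. The five-point "spectra", the origin labels "A" and "B", the choice k=3, the order of the two pretreatment steps and all function names are hypothetical, and a simple first difference stands in for whatever derivative filter was actually used.

```python
import math
from collections import Counter

def first_derivative(spectrum):
    """First difference of adjacent points (a crude first-derivative stand-in)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def fit_scaler(rows):
    """Column means and standard deviations, estimated from training rows only."""
    n = len(rows)
    means, stds = [], []
    for col in zip(*rows):
        m = sum(col) / n
        s = math.sqrt(sum((v - m) ** 2 for v in col) / n)
        means.append(m)
        stds.append(s or 1.0)  # avoid dividing by zero for constant columns
    return means, stds

def standardize(row, means, stds):
    """Scale one row with the training means and standard deviations."""
    return [(v - m) / s for v, m, s in zip(row, means, stds)]

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest in Euclidean distance."""
    nearest = sorted(zip((math.dist(x, query) for x in train_x), train_y))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical toy spectra: origin "A" rises steadily, origin "B" has a peak.
train_spectra = [
    [0.10, 0.20, 0.30, 0.40, 0.50],
    [0.12, 0.21, 0.32, 0.41, 0.52],
    [0.30, 0.41, 0.50, 0.61, 0.70],
    [0.10, 0.30, 0.50, 0.40, 0.30],
    [0.11, 0.30, 0.51, 0.42, 0.31],
    [0.30, 0.51, 0.70, 0.59, 0.50],
]
train_labels = ["A", "A", "A", "B", "B", "B"]
test_spectra = [
    [0.20, 0.31, 0.40, 0.50, 0.61],
    [0.05, 0.25, 0.44, 0.35, 0.24],
]

train_deriv = [first_derivative(s) for s in train_spectra]
means, stds = fit_scaler(train_deriv)
train_x = [standardize(d, means, stds) for d in train_deriv]

predictions = [
    knn_predict(train_x, train_labels, standardize(first_derivative(s), means, stds))
    for s in test_spectra
]
print(predictions)
```

In this toy data the third sample of each class carries a constant baseline offset, which the differencing step removes before the distances are computed.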
Genetic algorithm-multiple linear regression (GA-MLR) and least squares support vector machine (LS-SVM) combined with NIR spectroscopy were used to establish the quantitative models. For ferulic acid, the GA-MLR model parameters were R²=0.719, Q²LOO=0.684, RMSEP=0.010, and the GA-LS-SVM model parameters were R²=0.649, Q²LOO=0.646, RMSEP=0.014. For the ethanol extract, the GA-MLR model parameters were R²=0.704, Q²LOO=0.664, RMSEP=0.027, and the GA-LS-SVM model parameters were R²=0.990, Q²LOO=0.574, RMSEP=0.045. The results showed that the predictive ability of the GA-MLR models was better than that of the GA-LS-SVM models, which suggests that this complex system is better described by a linear model.
A high-performance liquid chromatography method with diode array detection and evaporative light scattering detection (HPLC-DAD-ELSD) was developed to determine the contents of eleven active compounds in compound Danshen Tablet. The content of each compound was used as a variable to establish the classification model. The variables were first ranked using support vector machine recursive feature elimination (RFE-SVM), and the LS-SVM method was then used to build the classification model. The important parameters were also optimized. The accuracy of the LS-SVM model was 96.67%, and the cross-validation accuracy was 90.00%. Ginsenosides Rb1 and Rd were selected as the two important variables. Rb1 and Rd might therefore have a major impact on the quality of compound Danshen Tablet and could be regarded as potential indicators for quality control.
In chapter 3, the DPPH free radical scavenging method was applied to test the antioxidant activity of the ethanol extracts of 56 Angelica samples. The ultraviolet (UV) spectrum of each sample was also recorded with a UV spectrophotometer.
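The fit and cross-validation statistics quoted for these regression models (R², Q²LOO, RMSE) can be illustrated with a small sketch. The Python below fits an ordinary multiple linear regression and computes leave-one-out Q² as 1 - PRESS/SS_tot, with SS_tot taken about the mean of the full training set. That convention is a common one and is assumed here rather than taken from the dissertation. The data, variable names and helper functions are hypothetical, and no genetic-algorithm variable selection is attempted.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_mlr(xs, ys):
    """Ordinary least squares with an intercept, via the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)

def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def r_squared(ys, preds):
    """1 - SS_res/SS_tot, with SS_tot about the mean of ys."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def rmse(ys, preds):
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))

def loo_predictions(xs, ys):
    """Predict each sample with a model fitted on all the other samples."""
    preds = []
    for i in range(len(ys)):
        coef = fit_mlr(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(predict(coef, xs[i]))
    return preds

# Hypothetical data: two selected variables and a measured response.
xs = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0],
      [5.0, 6.0], [6.0, 5.0], [7.0, 8.0], [8.0, 7.0]]
noise = [0.05, -0.04, 0.03, -0.02, 0.04, -0.05, 0.02, -0.03]
ys = [0.5 * a + 0.2 * b + e for (a, b), e in zip(xs, noise)]

coef = fit_mlr(xs, ys)
fitted = [predict(coef, x) for x in xs]
r2_fit = r_squared(ys, fitted)
cv_preds = loo_predictions(xs, ys)
q2_loo = r_squared(ys, cv_preds)
rmse_cv = rmse(ys, cv_preds)
print(r2_fit, q2_loo, rmse_cv)
```

For an ordinary least-squares model under this convention, Q²LOO cannot exceed the fitted R², because each left-out sample is predicted by a model that never saw it. A large gap between the two, such as R²=0.990 against Q²LOO=0.574 for the least squares support vector machine model of the extract content, is usually read as a sign of overfitting.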
The GA-MLR method was used to construct the quantitative spectrum-activity relationship (QSAR) between the UV spectra and the antioxidant activity of the Angelica samples. The statistical parameters of the best model were R²=0.721, Q²LOO=0.623 and RMSE=0.085 for the training set, and R²test=0.711 and RMSEP=0.080 for the test set. The results showed that the proposed GA-MLR model was reliable and predictive. It could be used to predict the antioxidant capacity of Angelica ethanol extract from UV spectra and to evaluate the quality of Angelica sinensis at the efficacy level.
In a further section, the NIR spectra of 27 compound Danshen Tablets were scanned, and the efficacy of the samples in mice (bleeding time) was also investigated. The GA-MLR method was used to construct the QSAR between the NIR spectra and the blood-activating efficacy of the compound Danshen Tablet samples. The statistical parameters of the best model were R²=0.888, Q²LOO=0.832 and RMSE=0.085 for the training set, and R²test=0.825 and RMSEP=0.112 for the test set. The results showed that the proposed GA-MLR model was reliable and predictive. It could be used to predict the efficacy of compound Danshen Tablets from NIR spectra and to evaluate their quality at the efficacy level.
In chapter 4, Pingwei Powder, a representative aromatic TCM that regulates gastrointestinal motility, was studied. In theory, the volatile components of this formulation make an important contribution to its pharmacological effects. The volatile components of Pingwei Powder and their sources were analyzed by GC-MS. The effect on gastric emptying in rats of essential oils from different formulations of the herbs that make up Pingwei Powder was investigated by single photon emission computed tomography (SPECT).
GA-MLR and GA-SVM methods were used to establish the quantitative composition-activity relationship (QCAR) between the composition of the essential oils and their gastric emptying efficacy. The results demonstrated that the essential oil of Pingwei Powder mainly consisted of β-eudesmol, hinesol, D-limonene and agarospirol. The essential oils from Pingwei Powder, Magnolia officinalis and Citrus reticulata had strong efficacy in promoting gastric emptying, while the essential oil from Atractylodes lancea had no effect on gastric emptying in healthy rats. The statistical parameters of the GA-MLR model (R²=0.826, RMSE=4.297) were slightly better than those of the GA-SVM model (R²=0.804, RMSE=4.666), but the LOO cross-validation results of the GA-SVM model (Q²LOO=0.783, RMSECV=4.861) were better than those of the GA-MLR model (Q²LOO=0.697, RMSECV=5.664). These preliminary results showed that the essential oil from Pingwei Powder promoted gastric emptying and that the relationship between essential oil composition and efficacy was better described as non-linear. β-Eudesmol and D-limonene, selected by GA as model variables, could be considered the main active substances promoting gastric emptying, while cyclohexanemethanol might play a role in inhibiting gastric emptying.
In chapter 5, an integrated steam distillation extraction apparatus was developed for extracting essential oils from herbs. The essential oils of Flos Magnoliae, Citrus peel, Mint and Chinese Angelica were each extracted with the developed apparatus and then analyzed by GC-MS. The oil yields and compositions were compared with those obtained with a traditional steam distillation apparatus. The oil yields of Flos Magnoliae, Citrus peel, Mint and Chinese Angelica increased by 42%, 39%, 25% and 50%, respectively. The compositions of the essential oils extracted by the two apparatuses differed, and the number of components increased with the new apparatus.
Therefore, the new apparatus could be an effective means of extracting essential oils from herbal materials.
The essential oils extracted from three kinds of herbs were separated on a 5% phenylmethyl silicone (DB-5MS) bonded-phase fused silica capillary column and identified by mass spectrometry. Of the compounds identified, 74 were selected as the data set, and their chemical structures and gas chromatographic retention times were used to build a quantitative structure-retention relationship (QSRR) model by GA-MLR analysis. The predictive ability of the model was verified by internal validation (R²=0.974, Q²LOO=0.970, RMSETrain=0.489). For external validation, the model was applied to predict the gas chromatographic retention times of 14 volatile compounds from the essential oil of Radix Angelicae Sinensis that had not been used for model development (Q²EXT=0.984, r²=0.960 and RMSEEXT=0.361). The applicability domain was checked by the leverage approach to verify prediction reliability. The results of these validation routes indicated that the best QSRR model was robust and satisfactory. It could provide a feasible and effective tool for predicting the gas chromatographic retention times of volatile compounds, and could also help in identifying compounds with the same gas chromatographic retention time.
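The leverage approach mentioned above can be sketched as follows. The leverage of a compound with descriptor vector x is h = x^T (X^T X)^-1 x, where X is the training descriptor matrix with an intercept column. A warning threshold of h* = 3p'/n, with p' the number of model parameters including the intercept and n the number of training compounds, is a widely used convention. The threshold, the two made-up descriptors and all names in this Python sketch are assumptions, not details taken from the dissertation.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def leverages(train_x, query_x):
    """h = x^T (X^T X)^-1 x for each query row, with an intercept column added."""
    rows = [[1.0] + list(x) for x in train_x]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    result = []
    for x in query_x:
        v = [1.0] + list(x)
        z = solve(xtx, v)  # z = (X^T X)^-1 v, without forming the inverse
        result.append(sum(vi * zi for vi, zi in zip(v, z)))
    return result

# Hypothetical training descriptors (two per compound).
train = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0],
         [5.0, 6.0], [6.0, 5.0], [7.0, 8.0], [8.0, 7.0]]

# Assumed warning leverage: 3 * (descriptors + intercept) / training size.
h_star = 3 * (len(train[0]) + 1) / len(train)

h_train = leverages(train, train)
# One query close to the training data and one far outside it.
h_new = leverages(train, [[4.0, 5.0], [20.0, 1.0]])
inside_domain = [h <= h_star for h in h_new]
print(h_star, h_new, inside_domain)
```

A prediction for a compound whose leverage exceeds h* is an extrapolation beyond the descriptor space covered by the training set, so it would normally be treated with more caution than a prediction for a compound inside the domain.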
Keywords/Search Tags: support vector machines, least squares support vector machines, genetic algorithm-multiple linear regression, random forests, k nearest neighbor algorithm, quantitative spectrum-activity relationship, quantitative structure-retention time