Research And Application Of Improved PLS In Traditional Chinese Medicine Data

Posted on: 2020-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: Q X Zeng
GTID: 2404330590497519
Subject: Computer Science and Technology
Abstract:
Basic research on Traditional Chinese Medicine (TCM) is an important part of the modernization of TCM. Experimental data are usually obtained by high performance liquid chromatography and mass spectrometry. Such data typically contain thousands of substances and are therefore high-dimensional; at the same time, the limited number of experimental samples makes them small-sample data. TCM prescriptions often exhibit multi-component, multi-effect and nonlinear characteristics during treatment. The experiments themselves are complicated and time-consuming, experimental animals are limited, and errors caused by objective factors further reduce the number of valid data samples. This complexity makes TCM data difficult to analyze directly with traditional machine learning methods, so the data must be processed appropriately before traditional data analysis models can be applied.

This thesis studies TCM data using optimized variants of partial least squares (PLS). The optimization addresses three aspects: feature selection, nonlinear feature extraction, and nonlinear improvement of the regression model. The main work is as follows.

(1) A feature-correlation-based PLS feature selection method is proposed. Traditional PLS considers only the importance of individual features and does not account for redundancy and multicollinearity among features. To address this, the statistical correlation between features is introduced into traditional PLS analysis. Features are first evaluated by feature correlation and a feature group is pre-selected; the group is then used to train a PLS model to judge whether it is worth keeping. Candidate features are evaluated in turn with a forward greedy search strategy, and the candidate that yields the smallest objective function value is added to the selected set. The method was tested on data for the Jun (monarch) medicine of Ma Xing Shi Gan Tang in relieving cough and asthma, and on UCI data sets. The experimental results show that the method finds better feature subsets.

(2) A PLS method combined with random forests is proposed. PLS is inherently linear, whereas the random forest algorithm combines multiple learners, is adaptive, and is suitable for nonlinear regression. A random forest is built from the principal components of the independent variables, extracted by the PLS external model, and the original dependent variable; it is then rebuilt from the residual information until a predetermined condition is met. Experiments were carried out on the Ma Xing Shi Gan Tang Jun medicine asthma test, the Ma Xing Shi Gan Tang Jun medicine cough test, and UCI machine learning data sets. The results show that PLS integrated with a random forest better captures the characteristics of TCM data and improves prediction accuracy on nonlinear data.

(3) A PLS optimization method based on a deep belief network is proposed. The cross-validation used by PLS to choose components can sharply reduce the number of principal components retained, which lowers the accuracy of the regression equation, and TCM data are particularly sensitive to this choice. The method uses a deep learning model to extract upper-layer features from the original data and feeds the extracted features into a PLS model for multiple linear regression. This avoids selecting the number of principal components while still reflecting the nonlinear structure in TCM data. The model parameters are adjusted repeatedly until the accuracy condition is met. Dachengqi Decoction data and UCI data sets were used for analysis. The experimental results show that PLS based on a deep belief network adapts well to TCM data.

(4) Experimental data on the material basis of Shenfu injection in the treatment of cardiogenic shock are analyzed. Endogenous and exogenous substances in the original data are first distinguished, and the data are pre-processed with one-way ANOVA to remove features whose values change little. Eleven supervised univariate feature selection methods are then used to rank the remaining endogenous substances by importance, and the intersection of the important features is taken. An unsupervised feature selection method is applied next to remove redundant features. The resulting feature set is used as the set of biomarkers (endogenous substances). Finally, the relationship between exogenous and endogenous substances is analyzed through the obtained biomarkers.

(5) Based on the above results and the data analysis needs of the TCM field, a TCM data analysis system was designed and developed using Python and related development tools.
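The forward greedy search in item (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: the objective function (cross-validated mean squared error of a single-response PLS model), the correlation screen with a hypothetical `max_corr` threshold, the function names `pls1_fit`, `cv_mse` and `forward_select`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS with NIPALS; return (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(min(n_components, X.shape[1], X.shape[0] - 1)):
        w = Xk.T @ yk                     # direction of maximal covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                  # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                        # latent score vector
        tt = t @ t
        p = Xk.T @ t / tt                 # X loading
        c = yk @ t / tt                   # y loading
        Xk = Xk - np.outer(t, p)          # deflate X
        yk = yk - c * t                   # deflate y
        W.append(w), P.append(p), q.append(c)
    if not W:
        return np.zeros(X.shape[1]), x_mean, y_mean
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean


def cv_mse(X, y, n_components=2, n_folds=5, seed=0):
    """Assumed objective: k-fold cross-validated mean squared error."""
    order = np.random.default_rng(seed).permutation(len(y))
    errors = []
    for test_idx in np.array_split(order, n_folds):
        train_idx = np.setdiff1d(order, test_idx)
        coef, x_mean, y_mean = pls1_fit(X[train_idx], y[train_idx], n_components)
        pred = (X[test_idx] - x_mean) @ coef + y_mean
        errors.append(np.mean((y[test_idx] - pred) ** 2))
    return float(np.mean(errors))


def forward_select(X, y, max_features=5, max_corr=0.9, n_components=2):
    """Greedily add the candidate feature that minimizes the objective."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected, best_score = [], np.inf
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        scores = {}
        for j in remaining:
            # Assumed screen: skip candidates redundant with selected features.
            if any(corr[j, s] > max_corr for s in selected):
                continue
            scores[j] = cv_mse(X[:, selected + [j]], y, n_components)
        if not scores:
            break
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best_score:  # stop when no candidate improves
            break
        selected.append(j_best)
        remaining.remove(j_best)
        best_score = scores[j_best]
    return selected, best_score


# Synthetic example: only features 0 and 3 drive the response.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(40, 8))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 3] + 0.1 * rng.normal(size=40)
selected, score = forward_select(X_demo, y_demo)
print(selected, round(score, 4))
```

Each candidate is scored by retraining the PLS model on the selected features plus that candidate, so redundancy is judged at the level of the feature group rather than feature by feature. The search stops when no remaining candidate lowers the objective or when `max_features` is reached.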
Keywords: Partial least squares, Chinese medicine information, Deep Belief Nets, Feature selection, Nonlinear feature extraction