Construction Of A Diagnosis Model Of Coronary Heart Disease With Phlegm And Blood Stasis Syndrome Based On Machine Learning And Study On Objective Quantitative Indicators Of Tongue Image

Posted on: 2024-08-21
Degree: Master
Type: Thesis
Country: China
Candidate: N J Chen
GTID: 2544306923982809
Subject: Traditional Chinese Medicine
Abstract/Summary:
Objective: To construct a machine learning diagnostic model for coronary heart disease (CHD) with phlegm and blood stasis syndrome, to evaluate the model's classification performance and interpretability, and to explore objective quantitative tongue image indicators of this syndrome, so as to provide a method and basis for efficient and accurate clinical identification of these patients.

Research methods:
1. Patients with stable CHD were included according to the relevant diagnostic criteria. Clinical data were collected, including information from the four diagnostic methods of TCM, demographic information, medical history, and dietary habits. Objective quantitative tongue image data were collected with a uniform tongue image acquisition device following a uniform acquisition procedure. According to the diagnostic criteria for CHD with phlegm and blood stasis syndrome, patients were classified into a phlegm and blood stasis group and a non-phlegm and blood stasis group.
2. Microsoft Office Excel was used to clean and organize the raw data: abnormal, duplicate, and erroneous records were removed, missing continuous values were filled with the mean, and missing categorical values were filled with the mode. Statistical analysis was performed in SPSS 22.0, using methods such as the chi-square test and t-test to analyze the demographic information, medical history, dietary habits, and objective quantitative tongue image indicators of the study subjects.
3. The models were built in Python 3.8. The dependent variable was whether the CHD patient had phlegm and blood stasis syndrome; the independent variables were the collected basic information, TCM symptoms, and objective quantitative tongue indicators. Five machine learning algorithms were used: Bagging K-Nearest Neighbors (Bagging-KNN), Bagging Decision Tree (Bagging DTC), Adaptive Boosting (AdaBoost), Gradient Boosting Decision Tree (GBDT), and Random Forest (RF). Four models were constructed from different input features. Model 1 used qualitative information such as basic information and TCM symptoms as input features, and Model 2 used quantitative information such as the objective quantitative tongue indicators. For Model 1 and Model 2, the algorithm with the best diagnostic performance was used to output the basic information and TCM symptoms, and the objective quantitative tongue indicators, respectively, with feature importance > 5‰. The algorithm that performed best in building Model 1 and Model 2 was then used to build Model 3, and the combination of basic information, TCM symptoms, and objective quantitative tongue indicators that gave optimal model performance was output. The classification performance of Model 3 was compared with that of Model 4 to explore the value of the objective quantitative tongue indicators for syndrome diagnosis. Classification performance was evaluated by accuracy, precision, recall, F1, and the ROC curve.
4. The SHAP method was used to explore the interpretability of the model and to analyze the magnitude and direction of the effect of each selected feature.

Research results:
1. Descriptive statistics: In terms of general data, 89 patients with stable CHD were finally included, of whom 51 had phlegm and blood stasis syndrome and 38 did not; 55 (61.8%) were male and 34 (38.2%) were female. The mean age was 64.9 ± 11.51 years, so the patients were mainly middle-aged and elderly. There was no statistically significant difference between the two groups in gender or age distribution, and the two groups were comparable at baseline. In terms of symptom distribution, the symptoms in descending order of frequency were: wiry, slippery, or astringent pulse; purple or dark tongue, stasis spots on the tongue, or varicose or purple sublingual veins; obesity or heaviness of the head and body; chest tightness or chest pain; greasy tongue coating; purple lips or gums; dark or sallow complexion. For the objective quantitative indicators of tongue color, the Tongue-a and Tongue-S indices differed significantly between the phlegm and blood stasis group and the non-phlegm and blood stasis group (P<0.05). For the objective quantitative indicators of tongue coating color, the Coat-Y index differed significantly between the two groups (P<0.05).
2. Diagnostic feature screening for CHD with phlegm and blood stasis syndrome: AdaBoost performed best for Model 1, which output a total of 47 basic information and TCM symptom features. AdaBoost also performed best for Model 2, which output a total of 20 objective quantitative tongue indicators. The two groups of variables were ranked by feature importance and combined, with different combinations used as input features. With AdaBoost used for Model 3, classification performance was best when 10 basic information and TCM symptom features and 2 objective quantitative tongue indicators were input. After accounting for the effect of one-hot encoding on the variables, a total of 9 diagnostic features of CHD with phlegm and blood stasis syndrome were finally screened: chest tightness or chest pain; greasy tongue coating; wiry, slippery, or astringent pulse; purple or dark tongue, stasis spots or petechiae on the tongue, or varicose or dark purple sublingual veins; purple lips or gums; age (35-44); obesity or heaviness of the head and body; TongueHSV_S (color saturation of the tongue body); and CoatRGB_B (blue component of the tongue coating).
3. Diagnostic model construction: the AdaBoost model gave the best classification performance for CHD with phlegm and blood stasis syndrome, with 96.3% accuracy, 95% precision, 100% recall, 97.44% F1, and AUC = 0.99.
4. Interpretability of the diagnostic model: the SHAP analysis showed that, among the basic information and TCM symptoms, purple lips or gums; purple or dark tongue, stasis spots or petechiae on the tongue, or varicose or dark purple sublingual veins; chest tightness or chest pain; obesity or heaviness of the head and body; wiry, slippery, or astringent pulse; and greasy tongue coating increased the probability of a phlegm and blood stasis diagnosis. The probability of a non-phlegm and blood stasis diagnosis was higher when the patient had no greasy coating, no wiry, slippery, or astringent pulse, no chest tightness or chest pain, and was between 35 and 44 years old. For the objective quantitative tongue indicators, a sample was more likely to be diagnosed with phlegm and blood stasis syndrome when TongueHSV_S (tongue saturation) was less than 81.14 or greater than 98.66, and more likely to be diagnosed as non-phlegm and blood stasis when the value was between 81.7 and 89.04. A sample was more likely to be diagnosed with phlegm and blood stasis syndrome when CoatRGB_B (blue component of the tongue coating) was greater than 83.34, and more likely to be diagnosed as non-phlegm and blood stasis when the value was less than 78.84.

Research conclusion:
1. The diagnostic model of CHD with phlegm and blood stasis syndrome has good classification performance and is valuable for the accurate clinical identification of these patients.
2. Objective quantitative study of tongue images can help to understand and identify phlegm and blood stasis syndrome in CHD more accurately.
3. Objective quantitative tongue image indicators such as tongue color saturation (TongueHSV_S) and the blue component of the tongue coating (CoatRGB_B) can serve as a new type of assessment index for CHD with phlegm and blood stasis syndrome, distinct from clinical examination, visual inspection, and pulse diagnosis.
4. Machine learning modeling that fuses objective quantitative indicators from the four diagnostic methods, such as tongue images, offers a new perspective for improving the classification performance of TCM syndrome diagnosis models.
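
The missing-value handling described in the research methods (mean for continuous variables, most frequent category for categorical variables) can be sketched as follows. The thesis performed this step in Excel; the Python below is only an illustration, and the function name `impute`, the column names, and the sample records are all hypothetical.

```python
from statistics import mean, mode


def impute(rows, continuous, categorical):
    """Return a copy of rows with None filled in:
    column mean for continuous columns, column mode for categorical ones."""
    filled = [dict(r) for r in rows]  # copy so the input is left untouched
    for col in continuous:
        observed = [r[col] for r in rows if r[col] is not None]
        fill = mean(observed)
        for r in filled:
            if r[col] is None:
                r[col] = fill
    for col in categorical:
        observed = [r[col] for r in rows if r[col] is not None]
        fill = mode(observed)  # most frequent observed category
        for r in filled:
            if r[col] is None:
                r[col] = fill
    return filled


# Hypothetical records, not data from the thesis.
records = [
    {"age": 60, "greasy_coating": "yes"},
    {"age": None, "greasy_coating": "yes"},
    {"age": 70, "greasy_coating": None},
    {"age": 65, "greasy_coating": "no"},
]
cleaned = impute(records, continuous=["age"], categorical=["greasy_coating"])
```

With only 89 patients, each imputed value carries noticeable weight, so it is worth recording which cells were filled.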
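
The comparison of the five ensemble algorithms named in the research methods might look roughly like the sketch below, assuming scikit-learn is used. The abstract gives neither the library, the hyperparameters, nor the train/test split, so everything here is an assumption: the data are synthetic (`make_classification`), the 70/30 split is arbitrary, and all estimators use library defaults.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    BaggingClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the 89-patient feature table (NOT the thesis data).
X, y = make_classification(n_samples=89, n_features=12, n_informative=5,
                           random_state=0)
# Assumed 70/30 stratified split; the abstract does not state the split used.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# The five algorithms listed in the abstract, with default settings.
models = {
    "Bagging-KNN": BaggingClassifier(KNeighborsClassifier(), random_state=0),
    "Bagging-DTC": BaggingClassifier(DecisionTreeClassifier(), random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "GBDT": GradientBoostingClassifier(random_state=0),
    "RF": RandomForestClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    prob = model.predict_proba(X_test)[:, 1]  # score for the positive class
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, zero_division=0),
        "recall": recall_score(y_test, pred, zero_division=0),
        "f1": f1_score(y_test, pred, zero_division=0),
        "auc": roc_auc_score(y_test, prob),
    }

for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

A single hold-out split on a sample this small gives unstable estimates; repeated or stratified k-fold cross-validation would be a reasonable alternative design.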
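
The reported metrics can be checked for internal consistency using the standard definitions of accuracy, precision, recall, and F1. The abstract does not give the test-set size or the confusion matrix, so the counts below are hypothetical: they are one set of values that happens to reproduce the reported percentages, not figures from the thesis.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# F1 computed directly from the reported precision (95%) and recall (100%).
f1_reported = 2 * 0.95 * 1.00 / (0.95 + 1.00)

# Hypothetical counts (assumption, not from the thesis): 27 test samples with
# 19 true positives, 1 false positive, 0 false negatives, 7 true negatives.
metrics = classification_metrics(tp=19, fp=1, fn=0, tn=7)

print(f"F1 from reported precision/recall: {f1_reported:.2%}")
print({name: f"{value:.2%}" for name, value in metrics.items()})
```

The F1 value follows from the reported precision and recall alone, which agrees with the 97.44% stated in the results.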
Keywords/Search Tags:Coronary heart disease, Phlegm and Blood Stasis Syndrome, Tongue, Objective quantification, Machine learning, SHAP