
Related Research On The Diagnosis Of Patients With Non-ST-segment Elevation Myocardial Infarction Based On Machine Learning Model

Posted on: 2023-06-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Qin
GTID: 1524306827459094
Subject: Internal Medicine
Abstract/Summary:
Objective: A database was established from the clinical data of patients with non-ST-segment elevation myocardial infarction (NSTEMI) and unstable angina (UA) treated in chest pain centers. The accuracy of the initial diagnosis of NSTEMI in this database was evaluated, and logistic regression and machine learning (ML) algorithms were used to construct NSTEMI diagnostic models. The optimal model was selected through a comprehensive evaluation of model performance, with the aim of improving the accuracy of NSTEMI diagnosis.

Methods: Part Ⅰ: 1) Clinical data of consecutively enrolled patients with non-ST-segment elevation acute coronary syndrome (NSTE-ACS) at the chest pain centers of the First Affiliated Hospital of Xinjiang Medical University and the First Affiliated Hospital of Shihezi University School of Medicine from January 2017 to December 2019 were entered into the database. 2) NSTE-ACS patients in the chest pain center database who completed coronary angiography (CAG) within 24 hours were selected as study subjects, and an experimental data set was established. The accuracy of the initial NSTEMI diagnosis in the experimental data set was evaluated using the CAG diagnosis as the reference standard. 3) Unconditional logistic regression was used to screen the diagnostic feature variables in the experimental data set. A logistic regression diagnostic model was constructed from the screened variables, and its performance was evaluated with the relevant model evaluation indicators.

Part Ⅱ: 1) Python 3.6 packages were used to preprocess the data in the initial data set and convert them into a format suitable for ML algorithms. 2) Three different types of ML algorithms were used to screen feature variables, and the best algorithm was selected by comparing algorithm performance to complete the screening of diagnostic feature items. 3) The screened feature items were ranked by importance according to classification weight and correlation coefficient, and Shapley values were used to describe the contribution of each feature item. 4) Based on the feature screening results, an experimental data set for ML model construction was established, and the hold-out method was used to divide the data set into training, validation and test sets at a ratio of 8:2. The ML diagnostic models were constructed on the training set, and the validation set was used to verify the consistency of the models.

Part Ⅲ: 1) ML diagnostic models were constructed separately from the diagnostic features recommended by the guideline and from the features screened by the ML algorithm. 2) Using the same test set, model evaluation indicators were used to evaluate and compare the performance of the models built with the two feature inclusion methods. By evaluating and comparing the overall performance of the different types of ML models constructed in this study, the optimal model was selected to improve the accuracy of NSTEMI diagnosis.

Results: Part Ⅰ: 1) A total of 1566 patients with NSTEMI and UA were included in the experimental data set. With the CAG diagnosis as the reference standard, the initial diagnosis of NSTEMI had a sensitivity of 88.59%, a specificity of 89.44%, a Youden index of 0.79, a Kappa value of 0.78, and an area under the ROC curve (AUC) of 0.821 (95% CI, 0.775-0.868). 2) After feature screening by logistic regression, the variables included in the NSTEMI diagnostic model were angina pectoris, ECG ST-segment depression, TIMI score, hematocrit (Hct), creatine kinase isoenzyme, lactate dehydrogenase (LDH), B-type natriuretic peptide (BNP), and cardiac troponin T (cTnT) (95% CI, OR = 3.467, 38.020, 1.314, 33.745, 0.997, 1.003, 1.000, 1.285). 3) The logistic regression model had a diagnostic sensitivity of 93.7%, a specificity of 94.21%, a Youden index of 0.873, a Kappa value of 0.84, and an AUC of 0.924.

Part Ⅱ: 1) In the performance evaluation of feature screening with the Random Forest (RF), SelectKBest and Extreme Gradient Boosting (XGBoost) algorithms, the average time spent on feature screening was 2.09±0.14 s, 0.51±0.07 s and 1.85±0.08 s, respectively. 2) The feature items were ranked by importance according to classification weight and correlation coefficient. The top-ranked feature items were cTnT, LDH, CK, and ECG ST-segment changes (95% CI, 0.21±0.15, 0.11±0.06, 0.08±0.005, 0.06±0.007). The Shapley values of the feature variables were consistent with the importance ranking of the feature items. Heat map analysis showed strong correlations among the top-ranked feature items. 3) The experimental data set established from the feature screening results comprised 701 records of NSTEMI and UA patients. It was divided at a ratio of 8:2 by the hold-out method, with 476 records used for training and validation of the ML models and 225 records used for model testing. 4) In the training and consistency verification of the ML models, the learning curves and validation curves of the models built with the XGBoost, RF, naive Bayes (NB) and gradient boosting machine (GBM) algorithms showed excellent fit and consistency.

Part Ⅲ: 1) Compared with the ML models built from the NSTEMI diagnostic features recommended by the guideline, the AUC values of the XGBoost, SVM, RF, GBM and logistic regression models built from features screened by the ML algorithm were improved (95% CI, P = 0.003, 0.04, 0.036, 0.002, 0.041). 2) The overall performance of the XGBoost model was better than that of the other ML models. On the test set, the accuracy, precision, recall and F1 score of the XGBoost model were 0.95±0.014, 0.94±0.0011, 0.98±0.003 and 0.96±0.007, respectively, for NSTEMI diagnosis and 0.93±0.017, 0.96±0.008, 0.82±0.014 and 0.89±0.014, respectively, for UA diagnosis (95% CI). The coefficient of determination of the model was 0.72, and the AUC value was 0.97.

Conclusion: 1) In the experimental data set established in this study, the logistic regression diagnostic model built from multiple feature items improved the sensitivity, specificity, consistency and accuracy of NSTEMI diagnosis compared with the initial diagnosis. 2) The SelectKBest algorithm selected in this study performed well in screening NSTEMI diagnostic features, and its screening results were consistent with the calculated Shapley values. The learning curves and validation curves of the XGBoost, RF, NB and GBM algorithms showed good fit. 3) Compared with the ML diagnostic models built only from the diagnostic features recommended by the guideline, the models built from features screened by the ML algorithm showed better performance. 4) The coefficient of determination and the AUC value in the PR curve of the XGBoost and GBM models were better than those of the other ML diagnostic models. In the evaluation based on accuracy, precision, recall and F1 score, the XGBoost model showed relatively balanced performance.
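The evaluation indicators named in the abstract (sensitivity, specificity, Youden index, Kappa value, accuracy, precision, recall and F1 score) can all be derived from a 2×2 table of predicted versus reference diagnoses. The following is a minimal sketch of those standard definitions, not the dissertation's own code; the names `ConfusionCounts` and `diagnostic_metrics` and the example counts are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """2x2 table against a reference standard (here, hypothetically, CAG)."""
    tp: int  # reference-positive cases classified as positive
    fn: int  # reference-positive cases classified as negative
    fp: int  # reference-negative cases classified as positive
    tn: int  # reference-negative cases classified as negative


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    n = c.tp + c.fn + c.fp + c.tn
    sensitivity = c.tp / (c.tp + c.fn)  # also called recall
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    accuracy = (c.tp + c.tn) / n
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Youden index: J = sensitivity + specificity - 1
    youden = sensitivity + specificity - 1
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_observed = accuracy
    p_chance = ((c.tp + c.fn) * (c.tp + c.fp)
                + (c.tn + c.fp) * (c.tn + c.fn)) / n ** 2
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
        "youden": youden,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only (not from the dissertation).
metrics = diagnostic_metrics(ConfusionCounts(tp=90, fn=10, fp=20, tn=80))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting the Youden index and Kappa alongside sensitivity and specificity, as the abstract does, summarizes both error types in one number and corrects raw agreement for chance.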
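The hold-out division described in the Methods can be illustrated with a small sketch. It assumes a simple two-way split with 20% of records held out for testing and adds stratification by diagnosis label so both parts keep a similar NSTEMI/UA mix; the stratification, the function name `stratified_holdout`, the seed and the toy labels are assumptions for illustration, since the abstract does not describe how the split was implemented.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), holding out test_fraction of each class."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels for illustration only (not the study data).
labels = ["NSTEMI"] * 60 + ["UA"] * 40
train_idx, test_idx = stratified_holdout(labels, test_fraction=0.2)
print(len(train_idx), len(test_idx))
```

Keeping the test indices fixed across models is what allows the comparison in Part Ⅲ, where models built from different feature sets are evaluated on the same test set.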
Keywords/Search Tags: Non-ST-segment elevation myocardial infarction, Machine learning algorithm, Machine learning diagnosis model, Auxiliary diagnosis