
Risk Prediction Of Diabetes Complications Based On Machine Learning

Posted on: 2024-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: C Xu
GTID: 2544307118478314
Subject: Software engineering
Abstract/Summary:
With socio-economic development, the prevalence of diabetes continues to rise, posing a threat to public health. The harm diabetes does to the body stems mainly from its complications, among which diabetic kidney disease (DKD) is one of the most common. Early diagnosis of DKD is of great significance for improving the quality of life and survival rate of patients with diabetes. This study aims to construct a DKD prediction model based on machine learning methods to evaluate whether patients with type 2 diabetes mellitus (T2DM) currently have DKD.

Patient data were collected from Xuzhou Medical University Affiliated Hospital to construct a T2DM-DKD clinical dataset as the research material, comprising 55 features and one classification label, with 1351 samples in total. For data preprocessing, the IQR boxplot method was used to handle outliers, and a KNN imputation model incorporating feature-weight calculation was constructed to address missing-at-random (MAR) values in the dataset. Compared with mean imputation and KNN imputation, the proposed model gave better imputation results.

To identify DKD risk factors and reduce data dimensionality, feature selection was applied: recursive feature elimination (RFE) extracted 18 highly correlated features from the original 55, and these served as the training set for the models. Correlation analysis together with univariate and multivariate logistic regression analysis showed that features such as systolic blood pressure and urine microalbumin were independent risk factors for DKD.

A DKD prediction model was then built on the Stacking ensemble learning algorithm. The basic Stacking framework was analyzed and improved, DKD prediction models were constructed with machine learning classification methods, and Liblinear LR, RBF-SVM, AdaBoost and Random Forest were selected as the individual learners for Stacking, with SVD-LDA as the secondary learner. Compared with the single models, the Stacking model increased accuracy by 5%, F1 score by 4%, and AUC by 7%.

The models were validated on 200 DKD-related records from the NPHDC public dataset. Compared with the AdaBoost and Random Forest ensemble learning models, the Stacking model improved prediction accuracy by 3% and F1 score by 5%. This verification showed that the Stacking model has the best overall performance among the models constructed in this study. It can accurately determine whether T2DM patients have DKD, and it showed good generalization ability in external validation. It can assist doctors in diagnosing DKD and provide a reference for expert consultation.
Keywords/Search Tags:Machine learning, DKD, Prediction model, Stacking algorithm
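
The learner configuration named in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the thesis implementation: the data below are synthetic (only the sample and feature counts mirror the abstract), the "improved" Stacking framework is not described in the abstract and is replaced here by the standard `StackingClassifier`, "SVD-LDA" is assumed to mean linear discriminant analysis with the SVD solver, and the choice of logistic regression as the RFE ranking estimator is also an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import (
    AdaBoostClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the clinical dataset (assumption): 1351 samples,
# 55 features, binary label. The real T2DM-DKD data are not available here.
X, y = make_classification(
    n_samples=1351, n_features=55, n_informative=18, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Individual (first-level) learners listed in the abstract.
base_learners = [
    ("lr", LogisticRegression(solver="liblinear")),
    ("svm", SVC(kernel="rbf")),
    ("ada", AdaBoostClassifier(random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
]

# Secondary learner: LDA with the SVD solver (assumed reading of "SVD-LDA").
# cv=5 means the secondary learner is trained on out-of-fold predictions.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LinearDiscriminantAnalysis(solver="svd"),
    cv=5,
)

# Scale, keep 18 features via RFE, then fit the stacked ensemble.
model = make_pipeline(
    StandardScaler(),
    RFE(LogisticRegression(solver="liblinear"), n_features_to_select=18),
    stack,
)
model.fit(X_train, y_train)

pred = model.predict(X_test)
proba = model.predict_proba(X_test)[:, 1]
print("accuracy:", accuracy_score(y_test, pred))
print("F1:", f1_score(y_test, pred))
print("AUC:", roc_auc_score(y_test, proba))
```

Placing RFE inside the pipeline keeps feature selection within the training data, so the held-out split is not used when ranking features. Scores printed on the synthetic data say nothing about the figures reported in the abstract.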