
A Study Of Two Machine Learning Methods To Construct A Risk Prediction Model For Nonalcoholic Fatty Liver Disease

Posted on: 2023-01-15
Degree: Master
Type: Thesis
Country: China
Candidate: J L Zhang
GTID: 2544306821950209
Subject: Internal medicine
Abstract/Summary:
Objective: To construct and evaluate risk prediction models for nonalcoholic fatty liver disease (NAFLD) using a CRT classification tree and a nomogram. The aim is to screen the population at high risk of NAFLD with machine learning methods and to provide a basis for the prevention and early diagnosis of NAFLD.

Methods: Data were collected from 10390 people with complete records who were examined by the physical examination department of the Affiliated Hospital of Guilin Medical College from July to October 2017. General data included name, sex, age, height, weight, past history, medication history and personal history. Laboratory indicators included triglyceride (TG), total cholesterol (TC), low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), fasting plasma glucose (FPG), alanine aminotransferase (ALT), aspartate aminotransferase (AST), uric acid (UA), creatinine (Cr) and urea nitrogen (BUN). Body mass index (BMI), triglyceride glucose index (TyG) and glomerular filtration rate (GFR) were calculated. 70% of the population (7273 cases) were randomly selected as the training set and 30% (3117 cases) as the validation set for internal validation. The population was divided into a case group and a control group according to whether fatty liver was present, and a CRT classification tree model and a nomogram model were constructed. The receiver operating characteristic (ROC) curve was drawn to evaluate the discrimination ability of the models, decision curve analysis (DCA) and the clinical impact curve (CIC) were used to evaluate their clinical practicability, and the calibration curve was used to evaluate their accuracy.

Results: 1. Compared with the control group (2148 cases), the levels of BMI, SBP, DBP, FPG, TG, TC, LDL-C, TyG, ALT, AST and UA were higher in the case group (2148 cases), while the level of HDL-C was lower (P<0.05). 2. There was no significant difference in age, FPG, TC, LDL-C, HDL-C, TyG, BUN, Cr, UA or GFR between the training set and the validation set. 3. The CRT classification tree model suggested that BMI, TyG, ALT, FPG, age and UA are related to NAFLD. The nomogram model showed that BMI, TyG, ALT, LDL-C, FPG, age, UA and HDL-C are correlated with NAFLD. 4. (1) The area under the ROC curve (AUC) of the CRT classification tree and the nomogram model was 0.857 and 0.881 respectively in the training set, and 0.853 and 0.892 respectively in the validation set, suggesting that both prediction models have good discrimination ability. (2) In the training set, the DCA threshold probability (Pt) range was 0.06-0.82 for the CRT classification tree model and 0.08-0.85 for the nomogram model, suggesting that both models provide a good net benefit. In the validation set, the Pt range was 0.06-0.85 for the CRT classification tree model and 0.08-0.9 for the nomogram model, which is not significantly different from the training set and supports the good clinical practicability found there. (3) In the CIC, the predicted probability at which the red curve and the blue curve coincide was about 0.85 for both the CRT classification tree and the nomogram model in the training set, and about 0.78 and 0.8 respectively in the validation set, suggesting that both models have high clinical impact value. (4) In the training set, the P value of the calibration curve was 0.590 for the CRT classification tree model and 0.348 for the nomogram model. Both P values are > 0.05, and the maximum deviation from the standard curve was 0.003 and 0.087 respectively, indicating no significant difference between the calibration curves of the prediction models and the standard curve. In the validation set, the P values were all > 0.05 and the maximum deviation did not differ significantly from that in the training set, suggesting that both prediction models have high accuracy.

Conclusion: 1. BMI, TyG, FPG, LDL-C, HDL-C, age, ALT and UA are associated with NAFLD. 2. The NAFLD prediction models based on the CRT classification tree and the nomogram show good clinical efficacy, good predictive ability and high predictive value for the risk of NAFLD.
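
The derived indices and the decision curve quantity mentioned in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the thesis's own code. The abstract gives no formulas, so the sketch assumes the commonly used definitions: BMI = weight (kg) / height (m)^2, TyG = ln(TG [mg/dL] x FPG [mg/dL] / 2), and net benefit at threshold probability Pt = TP/n - FP/n x Pt/(1 - Pt). All function names and example values are hypothetical.

```python
import math


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def tyg_index(tg_mg_dl, fpg_mg_dl):
    """Triglyceride glucose index, assuming the common definition
    ln(TG * FPG / 2) with both values in mg/dL. Whether the thesis
    used exactly this form and these units is an assumption."""
    return math.log(tg_mg_dl * fpg_mg_dl / 2)


def net_benefit(y_true, y_prob, pt):
    """Net benefit of a model at threshold probability pt (0 < pt < 1).

    y_true: 0/1 outcomes (1 = NAFLD), y_prob: predicted risks.
    Subjects with predicted risk >= pt are counted as test-positive.
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    return tp / n - fp / n * pt / (1 - pt)


def net_benefit_treat_all(y_true, pt):
    """Net benefit of labelling everyone as high risk, for comparison."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * pt / (1 - pt)


if __name__ == "__main__":
    # Toy values for illustration only, not data from the thesis.
    outcomes = [1, 1, 0, 0]
    risks = [0.9, 0.6, 0.4, 0.1]
    print(bmi(70, 1.75))
    print(tyg_index(150, 100))
    print(net_benefit(outcomes, risks, 0.5))
    print(net_benefit_treat_all(outcomes, 0.5))
```

In decision curve analysis a model is usually considered useful at a given Pt when its net benefit exceeds both the treat-all value and zero (treat none). The Pt ranges reported in the results presumably describe where this holds, although the abstract does not say so explicitly.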
Keywords/Search Tags: Machine Learning Methods, NAFLD, Classification Trees, Nomogram, Predictive Models