As one of the complications of type 2 diabetes, diabetic retinopathy (DR) is characterized by an insidious onset, chronic progression, and irreversibility, and its detection is affected by regional differences in medical equipment. In the early stages of the illness, patients often fail to notice the condition in time and do not seek medical intervention. By the time it is discovered, type 2 DR has often progressed to the middle or late stage, which greatly increases both the difficulty of treatment and the economic burden on patients. Risk assessment and prediction of type 2 DR can therefore provide a scientific basis for guiding high-risk groups to take effective measures to prevent or delay the onset of DR. This paper uses machine learning algorithms and relevant medical knowledge to screen the risk factors associated with the onset of type 2 DR and to build a type 2 DR risk prediction model. The specific research content is as follows. First, statistical methods and a filter-based random forest method with parameter optimization are used to preprocess the sample data and extract features, and 23 pathogenic factors related to DR are screened out. The rationality of these 23 risk factors is analyzed by consulting a large body of literature and expert evidence-based research. In addition to the recognized factors related to blood glucose, blood pressure, hemorheology, and nephropathy, and factors such as BMI, age, C-reactive protein, and hematoidin reported in most studies, lower extremity arterial disease, biliary tract disease, tumor markers, and liver function indicators are also found to be risk factors for the onset of type 2 DR. Second, the role of lower extremity arterial disease, biliary tract disease, tumor markers, and liver function indicators in the type 2 DR risk prediction model is explored. The data set is divided into basic data and combined data according to whether it contains these four groups of factors, and each is used in turn as the input data of the model. Logistic regression and support vector machines (SVM) are used to construct type 2 DR risk prediction models based on a parameter-optimized single classifier. Analysis of the prediction results shows that adding the above factors when constructing the type 2 DR risk prediction model effectively improves its prediction accuracy. It is also concluded that, for modeling on the combined data, the SVM with a Gaussian kernel and parameters C=1 and gamma=0.01 is the optimal single classifier. Finally, the combined data is used as the input data, and ensemble learning algorithms are used to further improve the screening of the population at high risk of type 2 DR. The parameters of XGBoost, LightGBM, and CatBoost are optimized through grid search and cross-validation, and a DR risk prediction model is established with each. The stacking principle is then used to integrate the XGBoost, LightGBM, and CatBoost algorithms to further improve prediction accuracy. Model comparison and analysis show that the type 2 DR risk prediction model based on the stacking fusion classifier is the optimal model in this study.