Research On Prediction Method Of Diabetic Retinopathy And Data Visualization For Diabetes

Posted on: 2022-08-16  Degree: Master  Type: Thesis
Country: China  Candidate: Z Shen  Full Text: PDF
GTID: 2544306323471464  Subject: digital media technology
Abstract/Summary:
Through data mining, data visualization and related technologies, healthcare big data is now widely used in many fields. Diabetes in particular has become too significant a health problem to ignore. This dissertation therefore uses structured clinical diabetes data to conduct research in two medical scenarios. First, new methods of feature selection and model fusion in data mining are studied for the prediction of diabetic retinopathy. Second, a visualization is designed to make the analysis of high-dimensional diabetes data more effective. The dissertation can be summarized as follows:

1. For diabetic retinopathy prediction, an Improved Backward Search Based on XGBoost (XGBIBS) is proposed to solve the feature redundancy problem in small, high-dimensional diabetes data. To provide more feature combinations and exploit combinations of multiple weak features, XGBIBS adds a cache subset to the search sequence and constructs sequences from multiple XGBoost feature importance metrics. XGBIBS improves the accuracy of various classifiers by about 3-8 percent and reduces the feature dimension by 38-65 percent, outperforming XGBSFS, SVM-RFE and GA.

2. For diabetic retinopathy prediction, a model fusion method based on the selection of base learners, called Sel-Stacking (Select-Stacking), is proposed to further improve model performance. To balance the accuracy and diversity of the base learners and improve the effect of the meta-learner, Sel-Stacking determines the optimal combination of base learners using the Pearson correlation coefficient and retains the Label-Proba of the base learners as the input matrix of the meta-learner. Compared with a variety of single learners, Sel-Stacking improves accuracy by about 1-10 percent, significantly improving the model's predictive performance.

3. An interactive visualization of diabetic patients' data, called DPVis, is constructed to address the shortcomings of existing tools for high-dimensional diabetes data analysis, such as poor medical adaptation and a high technical threshold. DPVis analyzes diabetes data from four aspects and incorporates data mining results through a variety of visual encodings and interactions. Multiple case studies demonstrate that DPVis meets the needs of four visualization tasks and facilitates the interpretation of data and the analysis of disease conditions.
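The abstract does not spell out how Sel-Stacking selects base learners, so the following is only a minimal sketch of one plausible reading. It assumes a greedy rule: rank learners by accuracy and keep a learner only if the Pearson correlation between its predicted probabilities and those of every learner already kept stays below a cutoff. It also assumes that "Label-Proba" means feeding both the predicted label and the predicted probability of each kept learner to the meta-learner. The function names, the `max_corr` cutoff, and the toy numbers are all hypothetical and do not come from the thesis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant output: treated as uncorrelated (assumption)
    return cov / sqrt(var_x * var_y)


def select_base_learners(probas, accuracies, max_corr=0.9):
    """Greedy selection (hypothetical rule, not the thesis's exact procedure).

    Learners are visited from most to least accurate; one is kept only if
    its outputs are not too correlated with any learner already kept.
    """
    ranked = sorted(probas, key=lambda name: accuracies[name], reverse=True)
    selected = []
    for name in ranked:
        if all(abs(pearson(probas[name], probas[kept])) < max_corr
               for kept in selected):
            selected.append(name)
    return selected


def meta_features(probas, selected, threshold=0.5):
    """Build one row per sample: [label, proba] for each selected learner."""
    n_samples = len(probas[selected[0]])
    rows = []
    for i in range(n_samples):
        row = []
        for name in selected:
            p = probas[name][i]
            row.extend([int(p >= threshold), p])
        rows.append(row)
    return rows


# Toy validation-set probabilities (made-up numbers, for illustration only).
probas = {
    "xgb": [0.9, 0.2, 0.8, 0.1, 0.7],
    "rf":  [0.8, 0.3, 0.7, 0.2, 0.6],  # almost collinear with "xgb"
    "knn": [0.6, 0.6, 0.4, 0.3, 0.9],
}
accuracies = {"xgb": 0.80, "rf": 0.78, "knn": 0.70}

selected = select_base_learners(probas, accuracies)
meta_X = meta_features(probas, selected)
```

In this toy run, "rf" is dropped because its outputs track "xgb" almost exactly, while the less accurate but more diverse "knn" is kept. This is the accuracy-versus-diversity trade-off that the abstract describes. In practice the probabilities would come from out-of-fold predictions, so that the meta-learner is not trained on leaked labels.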
Keywords/Search Tags: XGBoost feature selection, Model fusion by Stacking, Disease prediction, Interactive visualization