
Research On Risk Prediction Of Type 2 Diabetes Mellitus Based On Data Mining Technology

Posted on: 2018-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y N Zhu
GTID: 2334330533963332
Subject: Management Science and Engineering
Abstract/Summary:
Diabetes mellitus has become the third major threat to human health, after cardiovascular and cerebrovascular diseases and malignant tumors. The number of people with diabetes in China is growing at an alarming rate as living standards improve and the pace of life accelerates, and the disease is increasingly affecting younger people. The latest survey shows that China has almost 114 million people with diabetes, a situation marked by a high incidence rate, a low awareness rate, a low treatment rate, and a low target-attainment rate. Patients who do not receive timely treatment and control may develop complications such as cardiovascular and cerebrovascular disease and diabetic foot, which not only seriously affect their quality of life but also place a heavy burden on families and society. Preventing and controlling type 2 diabetes is therefore of great strategic significance for saving medical resources and reducing medical expenditure in China.

This thesis is based on the theory of data mining classification and classifier evaluation. First, the original data were collected from Qinhuangdao municipal hospital and cleaned using data preprocessing techniques. Second, to address the limitations of any single classifier, we compared the advantages and disadvantages of multiple classifiers (decision tree C5.0, artificial neural network, and support vector machine). Third, we used several evaluation tools to assess the performance and quality of the models and to identify the best classifier for predicting the risk of type 2 diabetes. Fourth, from the point of view of operability and practicability, we used the decision tree C5.0 algorithm to build prediction models on different data sets, which supports early warning and intervention for type 2 diabetes mellitus. Finally, because medical data are complex and clinical decision making demands high classification accuracy and algorithm stability, we used Weka and Eclipse to build an ensemble classifier that improves the robustness of the model. The aim was a data mining model with good stability, fast learning speed, and the best classification results.

The evaluation and comparison of the models showed that, among the single classifiers, the decision tree C5.0 model had the highest prediction accuracy, sensitivity, specificity, Youden index, and area under the ROC curve under complicated clinical conditions. This indicates that the decision tree C5.0 model is the most suitable single classifier for predicting the risk of type 2 diabetes mellitus, and that it can serve as a guide and reference for prevention and clinical diagnosis in groups at high risk of diabetes. However, because of the limitations of the classical algorithms, we also proposed an ensemble learning approach. We found that using the Bagging ensemble algorithm to combine multiple C4.5 single-classifier models gave good stability, fast learning speed, strong generalization ability, and the best classification performance on the complex clinical data sets.
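The evaluation measures named in the abstract (accuracy, sensitivity, specificity, Youden index, and area under the ROC curve) can be illustrated with a minimal Python sketch. This is not the thesis's own code, which was built with Weka and Eclipse; the function names `evaluate_binary` and `auc` and the toy labels and scores below are hypothetical and chosen only for illustration, with 1 standing for "diabetic" and 0 for "non-diabetic".

```python
def evaluate_binary(y_true, y_pred):
    """Return accuracy, sensitivity, specificity and Youden index
    for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    youden = sensitivity + specificity - 1              # Youden index J
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "youden": youden,
    }


def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive case scores higher than a randomly chosen
    negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example data, not taken from the thesis.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

metrics = evaluate_binary(y_true, y_pred)
area = auc(y_true, scores)
print(metrics, area)
```

The Youden index combines sensitivity and specificity into a single number between -1 and 1, which is why it is useful when comparing classifiers on data where both missed cases and false alarms matter.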
Keywords/Search Tags: type 2 diabetes mellitus, data mining, risk prediction, ensemble learning