Selective Ensemble Model And Its Application In Diabetes Prediction

Posted on:2021-01-30

Degree:Master

Type:Thesis

Country:China

Candidate:Y Y Li

Full Text:PDF

GTID:2494306458990809

Subject:Computer application technology

Abstract/Summary:

In recent years, ensemble learning has become a research hotspot in machine learning for medical data prediction. As an extension of ensemble learning, selective ensemble can reduce the ensemble size while maintaining high prediction accuracy. The key to selective ensemble lies in its selection strategy, but earlier static selection strategies did not fully account for differences among the samples to be tested. This thesis therefore designs and establishes a dynamic selective ensemble prediction model that selects the best set of base learners for each sample to be tested, so as to improve prediction accuracy in both regression and classification.

Diabetes has become one of the chronic diseases threatening human health. The latest survey shows that China has the largest number of diabetes patients in the world, 116 million in total. In addition, the quality of life of some patients is seriously affected by the lack of early treatment. The selective ensemble model is therefore applied to regression and classification prediction for diabetes, providing a basis for early screening and prevention. The main research content of this thesis covers the following three aspects:

(1) A new nearest neighbor similarity measure based on feature importance weighting is proposed. Because a learner's prediction performance differs across the samples to be tested, this thesis evaluates each learner's prediction accuracy on the nearest neighbor samples of the sample to be tested. However, existing nearest neighbor similarity measures usually adopt the Euclidean distance, which tends to ignore the importance of individual sample features. This thesis therefore proposes a nearest neighbor similarity measure based on feature importance weighting, drawing on the advantages of random forest in evaluating feature importance, such as strong interpretability and little parameter tuning. Experimental results show that this similarity measure improves prediction accuracy in regression and classification.

(2) A model called DSEP-KNNPAE (Dynamic Selective Ensemble Prediction Model Based on K-Nearest Neighbors Prediction Accuracy Evaluation) is designed and established. In this model, the nearest neighbor similarity measure based on feature importance weighting is used to find the best nearest neighbor samples of each sample to be tested. Comparison experiments against different algorithms and parameter sensitivity analysis experiments verify that the model outperforms existing ensemble learning algorithms in prediction accuracy for regression and classification.

(3) The DSEP-KNNPAE model is applied to regression prediction of blood glucose in diabetes and to genetic risk classification of gestational diabetes, in order to assist decision-making in early screening. Compared with existing ensemble learning algorithms, DSEP-KNNPAE achieves higher prediction accuracy in diabetes prediction, which effectively improves diabetes screening.
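The selection idea described in (1) and (2) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the similarity measure is taken to be a Euclidean distance with each squared difference scaled by a feature weight, the weights are assumed to come from random forest feature importances computed elsewhere, local accuracy is measured as mean absolute error on the neighbors, and the selected learners are combined by a plain average. The function names (`weighted_distance`, `select_learners`, `predict`) and the toy data are hypothetical and are not taken from the thesis.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]
Learner = Callable[[Vector], float]


def weighted_distance(a: Vector, b: Vector, weights: Vector) -> float:
    """Euclidean distance with each squared difference scaled by a feature weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def nearest_neighbors(query: Vector, X: Sequence[Vector],
                      weights: Vector, k: int) -> List[int]:
    """Indices of the k samples in X closest to the query under the weighted distance."""
    order = sorted(range(len(X)),
                   key=lambda i: weighted_distance(query, X[i], weights))
    return order[:k]


def select_learners(query: Vector, learners: Sequence[Learner],
                    X_val: Sequence[Vector], y_val: Sequence[float],
                    weights: Vector, k: int, n_select: int) -> List[int]:
    """Rank learners by mean absolute error on the query's neighbors; keep the best."""
    idx = nearest_neighbors(query, X_val, weights, k)

    def local_error(learner: Learner) -> float:
        return sum(abs(learner(X_val[i]) - y_val[i]) for i in idx) / len(idx)

    ranked = sorted(range(len(learners)),
                    key=lambda j: local_error(learners[j]))
    return ranked[:n_select]


def predict(query: Vector, learners: Sequence[Learner],
            X_val: Sequence[Vector], y_val: Sequence[float],
            weights: Vector, k: int = 3, n_select: int = 1) -> float:
    """Average the predictions of the learners selected for this particular query."""
    chosen = select_learners(query, learners, X_val, y_val, weights, k, n_select)
    return sum(learners[j](query) for j in chosen) / len(chosen)


# Toy data (hypothetical): feature 0 drives the target, feature 1 is noise.
X_val = [[0.0, 5.0], [1.0, 9.0], [2.0, 1.0],
         [8.0, 5.1], [9.0, 9.2], [10.0, 0.8]]
y_val = [0.0, 1.0, 2.0, 16.0, 18.0, 20.0]

# Assumed importances; in practice these might come from a fitted random forest.
weights = [1.0, 0.0]

# Two stand-in base learners, each accurate in a different region of the data.
learners = [lambda x: x[0], lambda x: 2 * x[0]]

low = predict([1.5, 5.0], learners, X_val, y_val, weights)
high = predict([9.5, 5.0], learners, X_val, y_val, weights)
```

In this toy setup the first learner is selected for the query near the low-valued samples and the second for the query near the high-valued ones, which is the per-sample behavior that distinguishes dynamic selection from a single static subset of learners.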

Keywords/Search Tags:

Dynamic selective ensemble, Nearest neighbor similarity measurement, Feature importance weighting, Diabetes prediction

Related items

1	Research On Computer Aided Design Of Pulmonary Nodules By Content-based Medical Image Retrieval
2	Research On Diabetes Decision-making Algorithm Based On Deep Learning
3	Application Of Ensemble Learning Algorithm In Diabetes Prediction
4	Research On Diabetes Prediction Model Based On Ensemble Learning
5	Risk Stratifications Prediction Of Gastrointestinal Stromal Tumors Based On Radiomics
6	Research On The Association Algorithm Between LncRNA And Disease Based On Multi-Layer Linear Projection
7	Research On Prediction Model Of Gestational Diabetes Mellitus Based On Ensemble Learning
8	Research On Application Of Support Vector Machine And K-Nearest Neighbor Learning In Cancer Classification Based On Multiclassification ROC Evaluation
9	Research On Prediction Model Of Gestational Diabetes Mellitus Based On Classifiers Ensemble
10	HIV Protease Cleavage Site Prediction Based On Feature Selection And Biological Similarity