
Research And Application Of Chronic Disease Prediction Based On Data Mining

Posted on: 2022-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: X Huang
Full Text: PDF
GTID: 2510306530480304
Subject: Electronics and Communications Engineering
Abstract/Summary:
As chronic diseases become an increasingly prominent problem in China, the government is paying more and more attention to their prevention, implementing a strategy of moving prevention forward and shifting the focus downward: starting upstream of disease occurrence, the risk factors for disease are controlled and managed effectively. A chronic disease risk prediction model can reveal the potential risks and trends of a disease so that it can be managed, prevented, and intervened in. Because a patient may have several diseases at the same time in real life, this thesis builds a multi-disease risk prediction model based on deep learning so that predictions are as comprehensive and accurate as possible.

Multi-disease risk prediction requires selecting the major risk factors from among many candidates and achieving satisfactory prediction results with fewer features. This study proposes a feature selection method from a data mining perspective to analyze the main risk factors of chronic diseases. Three tree-based ensemble learning algorithms, RF, GBDT, and XGBoost, are compared, and XGBoost, which has the best classification performance and the faster running speed, is chosen for feature selection. After the model's hyperparameters are tuned by random search, average gain is used to measure feature importance and to pick out the most important features. Verification on common chronic diseases shows that the selected features are basically consistent, and other major risk factors are selected as well.

The multi-disease prediction problem is converted into a multi-label classification problem through problem transformation, and the Label Powerset method is then used to convert the multi-label data set into a multi-class data set. Because each sample in the two-dimensional medical data set is one-dimensional, a one-dimensional convolutional neural network (1D-CNN) is selected to build the model. Models are built for three chronic diseases using the 21, 32, 42, and 62 features obtained through feature selection. The model built with 32 features, which has the highest prediction accuracy, is retained and loaded into a chronic disease risk prediction system based on the Django framework, enabling users to run disease predictions from the front end.

Because data imbalance is widespread in current medical data sets, this thesis applies the focal loss function to the disease prediction model. Although accuracy on the non-diseased category is sacrificed, the approach has some value for improving prediction accuracy on the categories of greater concern.
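
The Label Powerset step described in the abstract can be illustrated with a minimal sketch. It treats every distinct combination of disease labels as one class, which turns a multi-label target into a multi-class one. The function names and the toy label matrix are hypothetical and do not come from the thesis, which does not state how the transformation was implemented.

```python
from typing import Dict, List, Sequence, Tuple

LabelCombo = Tuple[int, ...]


def label_powerset_fit(Y: Sequence[Sequence[int]]) -> Dict[LabelCombo, int]:
    """Assign one class id to each distinct label combination in Y.

    Ids are handed out in order of first appearance.
    """
    mapping: Dict[LabelCombo, int] = {}
    for row in Y:
        combo = tuple(row)
        if combo not in mapping:
            mapping[combo] = len(mapping)
    return mapping


def label_powerset_transform(
    Y: Sequence[Sequence[int]], mapping: Dict[LabelCombo, int]
) -> List[int]:
    """Replace each multi-label row by the class id of its combination."""
    return [mapping[tuple(row)] for row in Y]


def label_powerset_inverse(
    classes: Sequence[int], mapping: Dict[LabelCombo, int]
) -> List[List[int]]:
    """Map predicted class ids back to multi-label rows."""
    inverse = {class_id: combo for combo, class_id in mapping.items()}
    return [list(inverse[c]) for c in classes]


# Hypothetical toy data: each column stands for one of three diseases.
Y = [
    [0, 0, 0],  # no disease
    [1, 0, 0],  # first disease only
    [1, 1, 0],  # first and second disease
    [0, 0, 0],
    [1, 1, 0],
]
mapping = label_powerset_fit(Y)
classes = label_powerset_transform(Y, mapping)
restored = label_powerset_inverse(classes, mapping)
```

In this sketch a label combination that never occurs in the training data has no class id, so `label_powerset_transform` raises a `KeyError` for it. That is a known limitation of Label Powerset in general: the classifier can only predict combinations it has already seen.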
Keywords/Search Tags: Feature selection, XGBoost, Multi-label classification, 1D-CNN, Django, Focal loss function
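
As a rough illustration of the loss named in the abstract and keywords, the sketch below assumes it is the standard focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The values of `gamma` and `alpha` are illustrative assumptions, since the abstract does not report the settings used, and the function is a single-sample version written for clarity rather than the thesis's implementation.

```python
import math
from typing import Optional, Sequence


def focal_loss(
    probs: Sequence[float],
    target: int,
    gamma: float = 2.0,
    alpha: Optional[Sequence[float]] = None,
) -> float:
    """Focal loss for one sample.

    probs  : predicted class probabilities (e.g. a softmax output)
    target : index of the true class
    gamma  : focusing parameter; gamma = 0 gives plain cross-entropy
    alpha  : optional per-class weights (assumed values, not from the thesis)
    """
    # Clamp to avoid log(0) when the model assigns zero probability.
    p_t = min(max(probs[target], 1e-12), 1.0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy sample (true class predicted with 0.9) and a hard one (0.1).
easy = focal_loss([0.9, 0.1], target=0)
hard = focal_loss([0.9, 0.1], target=1)

# With gamma = 0 the modulating factor disappears.
cross_entropy_easy = focal_loss([0.9, 0.1], target=0, gamma=0.0)
```

With `gamma = 2`, the easy sample's loss is scaled by (1 - 0.9)^2 = 0.01 relative to cross-entropy, while the hard sample keeps most of its loss. Training therefore concentrates on poorly classified examples, which in imbalanced data tend to come from the minority classes. That matches the trade-off the abstract reports: some accuracy on the non-diseased category is given up for better accuracy on the disease categories.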