
Research On The Prediction Model Of Hypertension Risk Based On Electronic Medical Record Data

Posted on: 2022-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: W B Ma
Full Text: PDF
GTID: 2514306566491184
Subject: Computer technology
Abstract/Summary:
Hypertension is one of the most prevalent chronic diseases and a leading cause of cardiovascular and cerebrovascular disease. Sample surveys of hypertensive patients across the country in recent decades show that the prevalence of hypertension among younger people is increasing year by year. To date, no method or drug can completely cure high blood pressure. Current treatment mainly uses drugs to keep blood pressure within the normal range and so avoid complications, and the most important measure for hypertensive patients is lifestyle intervention. It is therefore necessary to detect hypertension as early as possible and to begin medical intervention early.

Working from electronic medical record data, and with the aim of building a hypertension risk prediction model, this study proposes a hypertension prediction model based on four consecutive years of physical examination data. The model uses the physical examination data of the first three years to predict the probability of hypertension in the fourth year. A large volume of high-dimensional physical examination data forms the basis of the model. First, the unprocessed raw data undergo physical examination item selection, data cleaning and conversion, data standardization, and feature construction. Second, principal component analysis (PCA) and the sequential backward selection (SBS) algorithm are used to select the best feature subset. The class imbalance of the medical data is also studied, and the combined SMOTE+ENN algorithm is used to address it. Finally, five classification models (decision tree, random forest, support vector machine, logistic regression, and naive Bayes) are used to construct the hypertension prediction model.

The main research work of this paper is as follows:
(1) Under the guidance of medical data research theory, the relevant steps are explained and analyzed, including data preprocessing, physical examination item selection, model construction, and evaluation. Existing problems are identified and corresponding solutions are proposed, and on this basis a hypertension prediction model is established.
(2) In the feature selection stage, PCA and the SBS algorithm are used to select features. For the medical data imbalance problem, the combined SMOTE+ENN algorithm is proposed to rebalance the data.
(3) An empirical study is carried out following the strategy for constructing the hypertension prevalence prediction model. Decision tree, random forest, support vector machine, naive Bayes, and logistic regression algorithms are compared in controlled experiments on data produced by the different feature selection methods, and the significance of the differences between the five algorithms is tested. The random forest model using SBS feature selection achieved the best overall performance, with an accuracy of 77.44%, a sensitivity of 84.10%, a specificity of 75.95%, and an AUC of 0.875.
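The abstract names sequential backward selection (SBS) but does not describe its procedure. As a rough illustration only, the greedy idea can be sketched as follows: start from all features, repeatedly drop the one whose removal gives the highest score, and keep the best subset seen. The function name, the feature names, and the toy scoring function are assumptions made for this sketch; in the thesis the score would presumably be a classifier's validation performance, and the actual implementation is not given in the abstract.

```python
from typing import Callable, Sequence, Tuple

Subset = Tuple[str, ...]


def sequential_backward_selection(
    features: Sequence[str],
    score: Callable[[Subset], float],
    k_min: int = 1,
) -> Tuple[Subset, float]:
    """Greedy SBS: drop one feature per round, remember the best subset seen."""
    current: Subset = tuple(features)
    best_subset, best_score = current, score(current)
    while len(current) > k_min:
        # Build every subset obtained by removing exactly one feature.
        candidates = [
            tuple(f for f in current if f != dropped) for dropped in current
        ]
        # Keep the candidate that scores highest after the removal.
        current = max(candidates, key=score)
        current_score = score(current)
        # On ties, prefer the smaller subset.
        if current_score >= best_score:
            best_subset, best_score = current, current_score
    return best_subset, best_score


# Hypothetical scoring function, standing in for a model's validation score:
# two "informative" features each add 1.0, and every feature costs 0.1.
INFORMATIVE = {"sbp", "bmi"}


def toy_score(subset: Subset) -> float:
    return sum(1.0 for f in subset if f in INFORMATIVE) - 0.1 * len(subset)


if __name__ == "__main__":
    subset, value = sequential_backward_selection(
        ["sbp", "bmi", "height", "noise"], toy_score
    )
    print(subset, round(value, 2))
```

With this toy score the search retains only the two features marked informative. With a real classifier the score is noisy, so the selected subset depends on the validation scheme used.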
Keywords/Search Tags: machine learning, physical examination data, feature selection, hypertension prediction