Research On Optimization Of Hypertension Risk Prediction Based On Machine Learning

Posted on:2024-09-25

Degree:Master

Type:Thesis

Country:China

Candidate:H Y Wu

Full Text:PDF

GTID:2544306932460444

Subject:Electronic information

Abstract/Summary:

Big data and machine learning now play an important role in many fields, and the medical industry has made great progress in the collection, processing, analysis and application of data. Using medical data to build predictive models that support medical decision-making is of great significance. Hypertension is one of the most difficult chronic diseases to treat worldwide; it is considered the largest contributor to the global burden of disease and is also a major contributor to cardiovascular disease. Predicting the risk of hypertension can help patients achieve prevention and effective treatment before onset or in the early stage of the disease, and can prevent the life-threatening complications caused by its deterioration. Because hypertension data are high-dimensional, difficult to handle and imbalanced, a simple prediction model can hardly provide accurate predictions. This thesis therefore studies the optimization of hypertension risk prediction based on machine learning and uses two optimization strategies to improve the random forest model. The main research content is as follows.

First, the hypertension data in the NHANES and CHARLS databases are analyzed. The source and attribute characteristics of the data are introduced in detail, and the raw data are then preprocessed, including data visualization, data cleaning and data normalization. Next, to address the imbalance of the hypertension data, the CURE-SMOTE algorithm is used to balance the data. Finally, traditional machine learning algorithms are used to establish predictive models.

Second, the EDE-RF hypertension risk prediction model is established. A random forest is composed of multiple decision trees, and the number of trees strongly affects the performance of the model. A suitable subset of feature attributes can prevent over-fitting, increase the diversity among the classification trees and reduce the correlation between them, which also improves the prediction quality of the random forest model. In addition, the depth of the decision trees affects how well the model fits. Selecting the optimal parameter combination can therefore improve the classification accuracy of the model. This thesis proposes a differential evolution algorithm based on an elite retention strategy and combines it with the random forest model to construct the EDE-RF prediction model. The model searches for the optimal parameter combination by optimizing three parameters of the random forest algorithm: the number of decision trees, the subset of feature attributes and the maximum depth. A comparison with other optimization algorithms shows that the proposed method finds the optimal model parameters accurately and quickly and effectively improves the performance of the prediction model.

Finally, an improved EDE-IRF hypertension risk prediction model is constructed. To address the problem that the EDE-RF model generates decision trees with low accuracy and high similarity during parameter optimization, a screening scheme for decision trees is proposed. Trees are screened according to their classification accuracy and the similarity between them, and the trees that meet the criteria are recombined to form a new random forest model. This method increases the difference between decision trees and retains the trees with higher accuracy, thereby optimizing the model. Accuracy, sensitivity, precision, F-measure and the ROC curve are used to evaluate the experimental results. The results show that the improved EDE-IRF model achieves a significant improvement on multiple indicators, predicts the risk of hypertension more accurately, and has high reference value in practical applications.
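The parameter search described in the abstract can be illustrated with a minimal sketch of differential evolution with elite retention over three integer parameters. The abstract does not specify how elites are retained, so this sketch assumes one common reading: each generation is replaced by its trial vectors, and the best individuals of the previous generation overwrite the worst new ones when they score higher. The names `elitist_de`, `toy_fitness` and `BOUNDS`, the search ranges and the control settings are hypothetical, and `toy_fitness` is only a stand-in for the cross-validated accuracy of a random forest.

```python
import random

# Hypothetical search ranges: (number of trees, features per split, max depth).
BOUNDS = [(10, 500), (1, 20), (2, 30)]


def toy_fitness(params):
    """Stand-in for cross-validated accuracy; peaks at (200, 6, 12)."""
    n_trees, n_features, max_depth = params
    return -(((n_trees - 200) / 100) ** 2
             + ((n_features - 6) / 5) ** 2
             + ((max_depth - 12) / 10) ** 2)


def elitist_de(fitness, bounds, pop_size=12, generations=40,
               f=0.5, cr=0.9, n_elite=2, seed=0):
    """Maximise `fitness` over integer vectors with DE/rand/1/bin plus elites."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.randint(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    history = []  # best score after each generation

    for _ in range(generations):
        # Remember the current elites before the population is replaced.
        ranked = sorted(range(pop_size), key=lambda i: scores[i], reverse=True)
        elites = [(pop[i][:], scores[i]) for i in ranked[:n_elite]]

        trials = []
        for i in range(pop_size):
            # Mutation uses three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    trial.append(max(lo, min(hi, round(value))))  # clip to range
                else:
                    trial.append(pop[i][d])
            trials.append(trial)

        pop = trials
        scores = [fitness(ind) for ind in pop]

        # Elite retention: previous elites overwrite the worst newcomers
        # whenever they score higher, so the best score never decreases.
        worst = sorted(range(pop_size), key=lambda i: scores[i])[:n_elite]
        for idx, (ind, score) in zip(worst, elites):
            if score > scores[idx]:
                pop[idx], scores[idx] = ind, score
        history.append(max(scores))

    best = max(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best], history


best_params, best_score, history = elitist_de(toy_fitness, BOUNDS, seed=1)
print(best_params, best_score)
```

In a real experiment the fitness function would train a random forest with the decoded parameters and return a validation score, for example by mapping the three values to the `n_estimators`, `max_features` and `max_depth` arguments of scikit-learn's `RandomForestClassifier`; that mapping is an assumption here, not something the abstract states.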

Keywords/Search Tags:

Hypertension risk prediction, Machine learning, Imbalanced data, Differential evolution algorithm, Improved random forest

Related items

1	Construction And Evaluation Of Antenatal Depression Risk Prediction Model Based On Random Forest Algorithm
2	Machine-learning-based Prediction Of Stroke Risk Among Middle-aged And Elderly Chinese With Imbalanced Data
3	Research On The Establishment Of Prediction Models Of IVF-ET Treatment Outcomes And Analysis Of Prediction Characteristics Based On Random Forest Algorithm
4	Study On Drug Recommendation Based On Improved Random Forest Model
5	Research On Prediction Algorithm Of Thrombosis Risk Based On Imbalanced Data
6	Research On Application Of Imbalanced Learning Technology In Medical Data
7	Epileptic Seizure Prediction Algorithm Based On EEG Signals
8	Breast Cancer Risk Prediction Based On Apache Spark
9	Research On Risk Prediction Of Diabetes Based On Random Forest And Support Vector
10	Prediction Of CYP450 Inhibitor Based On Machine Learning And Imbalanced Data Sampling Techniques