
Application Of Machine Learning Algorithms Based On Electronic Medical Records In Cardiovascular Disease Prediction

Posted on: 2021-01-02 | Degree: Master | Type: Thesis
Country: China | Candidate: Z Z Du
GTID: 2404330629451053 | Subject: Communication and Information System
Abstract/Summary:
With the advent of the big data era, the continuous improvement and widespread adoption of medical information systems have led to explosive growth in biomedical data. Medical imaging, electronic medical records, biometric markers, and clinical registry records all hold significant potential research value. However, medical research based on statistical methods is limited by the type and size of the study population, so it cannot effectively mine large-scale medical information, whereas machine learning techniques designed for big data can address this problem. This thesis therefore applies machine learning to integrate the information in electronic medical records from multiple perspectives, analyze it in depth, extract useful clinical characteristics, and establish a cardiovascular disease risk prediction model for hypertensive patients, providing a reference for clinical diagnosis. The main research contents are as follows:
(1) Based on the electronic medical records of hundreds of thousands of hypertensive patients collected by the Shenzhen Health Information Platform, and after patient population screening, data preprocessing, construction of feature variables, and the corresponding statistical analysis, the XGBoost algorithm was used to train a coronary heart disease prediction model composed of 53 feature variables, with an AUC of 0.967 and an accuracy of 0.918.
(2) Comparison of this model with the traditional Framingham model shows that a risk prediction model built on big data and machine learning algorithms achieves higher prediction accuracy and better performance than traditional statistical models. Subsequent univariate analysis experiments found that both the traditional risk factors and the clinical features extracted from electronic health records show a highly non-linear relationship with the predicted probability of disease occurrence.
(3) To enhance the interpretability of the model, comparison studies were also performed to show the effects of training sample size, factor sets, and modeling approaches on prediction performance.
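For readers unfamiliar with the two metrics reported in the abstract (AUC and accuracy), the following is a minimal sketch of how they can be computed from predicted risk probabilities. It is not the thesis's code: the function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the example uses only the Python standard library rather than XGBoost.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_prob: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded prediction matches the label.

    The 0.5 threshold is an assumption; the abstract does not state one.
    """
    correct = sum(int(p >= threshold) == y for y, p in zip(y_true, y_prob))
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = disease, 0 = no disease.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

auc_value = roc_auc(labels, scores)
acc_value = accuracy(labels, scores)
print(f"AUC = {auc_value:.4f}, accuracy = {acc_value:.4f}")
```

AUC depends only on how the model ranks patients, while accuracy also depends on the chosen threshold, which is why the two figures are usually reported together. The pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable on a cohort of the size described in the abstract.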
Keywords/Search Tags: Machine learning, XGBoost, Disease prediction, Electronic medical records