Disease Risk Assessment Based On Physical Examination Data

Posted on: 2022-11-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Yang
Full Text: PDF
GTID: 1484306764460254
Subject: Biomedical engineering
Abstract/Summary:
The analysis of health and medical data offers broad opportunities for the development of the health care industry. It can advance disease prediction, intelligent diagnosis, and etiology identification, and can drive the transformation of clinical research, public health, people-friendly services, and industrial development. Real-world big data captures the intrinsic correlations among things. Applying new information technologies to health care data makes it possible to mine the relationships among different elements from complex and diverse data, which helps identify the main factors that affect diseases. As living environments change and the pace of life accelerates, people pay increasing attention to health physical examinations. Physical examination institutions have recorded the examination information of large numbers of examinees, leading to explosive growth in physical examination data. Mining valuable information from these massive data and clarifying how various chronic diseases evolve is very important for disease prevention. Risk early warning and health management for patients with chronic diseases are effective means of delaying disease progression, controlling complications, and reducing the disability rate. Disease risk assessment is the key technology for studying the relationship between disease risk factors and disease morbidity and mortality, and it is the core link of disease prevention and control.

This dissertation used two large-scale sets of physical examination data to mine the risk factors associated with diabetes and coronary heart disease (CHD). Based on these risk factors, risk assessment models and risk score cards for diabetes and CHD were constructed. By analyzing the factors related to diabetes follow-up satisfaction, health management strategies for diabetes and CHD were discussed. The data used in this study are retrospective, so the study is a retrospective case-control study. The main research contents of this dissertation are as follows.

First, a diabetes risk assessment system was established based on the physical examination data of patients with diabetes mellitus (type I and II) and healthy persons recorded from 2011 to 2017 in the electronic medical record system of the Luzhou Municipal Health Commission. During the construction of this system, three types of physical examination data were integrated (demographic information, vital signs, and laboratory test values), and a detailed statistical analysis was carried out. Feature selection then identified a set of six diabetes risk factors: Age, Waist-to-Height Ratio (WHtR), Body Mass Index (BMI), Mean Systolic Pressure (MSP), Fasting Blood Glucose (FBG), and Urine Glucose (UGLU). Based on these six risk factors, a diabetes risk assessment model and a diabetes risk score card were established to assess diabetes risk at the system level and the user level. The risk assessment model, built with the eXtreme Gradient Boosting (XGBoost) algorithm, achieved an AUC of 0.8763. Although this model can be used to assess diabetes risk in large-scale systems, it is not practical and lacks sufficient interpretability for users without a machine learning background. Therefore, on the basis of this model, the diabetes risk score card was constructed by binning each continuous variable and combining logistic regression (LR) with the scorecard scaling algorithm. The score card achieved an AUC of 0.8681 on the independent test set, only slightly lower than that of the prediction model, which indicates that almost no information is lost after feature binning. The score card is user-friendly and gives users an indication of their diabetes risk.

Using diabetes follow-up data from the same source and the same period, the study further constructed a follow-up satisfaction model with XGBoost. With FBG included in the feature set, the AUC was 0.9450; with FBG excluded, the AUC was 0.7742. Further feature analysis revealed that regular medication and emotional control were factors related to diabetes control.

Further, a risk assessment system for CHD was established based on the physical examination data of healthy people and CHD patients collected from January 2018 to October 2020 in 128 township health centers and 18 community health service centers across 7 regions under the Luzhou Municipal Health Commission. During the construction of this system, a three-step feature screening scheme was designed, and 11 risk factors were selected from the original basic physical examination features: Age, Gender, Waist-to-Height Ratio (WHtR), Mean Systolic Pressure (MSP), symptoms, temperature, dentition, electrocardiogram (ECG), Fasting Blood Glucose (FBG), Platelet (PLT), and Blood Urea Nitrogen (BUN). Because previous studies have shown significant differences in the incidence of CHD between men and women, separate CHD risk assessment models and score cards were established for men and women. When establishing the CHD risk assessment model, the performance of a fully connected network (FCN), LR, and XGBoost was compared, and the FCN was chosen for its good performance to build the gender-specific risk assessment models. Gender-specific CHD risk score cards were then established by combining LR with the scorecard calibration algorithm.

The male CHD risk assessment model achieved AUCs of 0.8671 and 0.8659 on the independent test set and the external validation set, respectively. The female CHD risk assessment model achieved AUCs of 0.8991 and 0.9006 on the independent test set and the external validation set, respectively. The male CHD risk score card achieved AUCs of 0.8668 and 0.8238 on the independent test set and the external validation set, respectively. Similarly, the female CHD risk score card achieved an AUC of 0.8884 on the independent test set and 0.8519 on the external validation set.

In addition, analysis of the Traditional Chinese Medicine constitution data of healthy people and CHD patients from the same source and the same period showed that neutral constitution was the most important protective factor for CHD, while phlegm-dampness constitution was the most important risk factor. Finally, a comparison of the living habits of people with CHD and healthy people showed that people with CHD generally make a conscious and active effort to improve their lifestyle.

In summary, based on large-scale physical examination data, this dissertation discusses the use of physical examination big data to mine chronic disease risk factors and builds a cascaded chronic disease risk assessment system. Two online service platforms were established to provide guidance for the early diagnosis and treatment of chronic diseases.
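
The score cards in the abstract are described only at a high level: continuous variables are binned, LR is fitted, and a scorecard scaling step turns the result into points. The sketch below shows one common way such a pipeline can be put together, using weight of evidence (WoE) per bin and "points to double the odds" (PDO) scaling. It is an illustration under stated assumptions, not the dissertation's actual method: the cut points, case and control counts, LR coefficients, intercept, base score, base odds, and PDO are all hypothetical values chosen for the example.

```python
import math
from bisect import bisect_right


def assign_bin(value, cut_points):
    """Return the 0-based index of the bin containing `value`.

    Bins are (-inf, c0), [c0, c1), ..., [c_last, +inf).
    """
    return bisect_right(cut_points, value)


def weight_of_evidence(cases, controls):
    """Per-bin WoE = ln(share of cases in the bin / share of controls in the bin)."""
    total_cases, total_controls = sum(cases), sum(controls)
    return [
        math.log((c / total_cases) / (h / total_controls))
        for c, h in zip(cases, controls)
    ]


def scaling_constants(base_score, base_odds, pdo):
    """Constants such that score = offset + factor * ln(odds).

    `base_score` is the score assigned at `base_odds`; adding `pdo`
    points corresponds to doubling the odds of disease.
    """
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return factor, offset


def total_score(bin_indices, woe_tables, coefs, intercept, factor, offset):
    """Map the LR log-odds (on WoE-encoded features) to a score."""
    log_odds = intercept + sum(
        coef * woe[i] for coef, woe, i in zip(coefs, woe_tables, bin_indices)
    )
    return offset + factor * log_odds


# Hypothetical inputs for illustration only (not taken from the dissertation).
FBG_CUTS = [5.6, 7.0]                                        # mmol/L, assumed
AGE_CUTS = [45, 60]                                          # years, assumed
FBG_WOE = weight_of_evidence([10, 30, 60], [70, 20, 10])     # assumed counts
AGE_WOE = weight_of_evidence([20, 30, 50], [50, 30, 20])     # assumed counts
COEFS = [1.0, 0.8]                                           # assumed LR coefficients
INTERCEPT = -2.0                                             # assumed LR intercept
FACTOR, OFFSET = scaling_constants(base_score=50, base_odds=0.1, pdo=10)


def score_person(fbg, age):
    """Score one person from raw FBG and age values (higher = higher risk)."""
    bins = [assign_bin(fbg, FBG_CUTS), assign_bin(age, AGE_CUTS)]
    return total_score(bins, [FBG_WOE, AGE_WOE], COEFS, INTERCEPT, FACTOR, OFFSET)


if __name__ == "__main__":
    print(round(score_person(fbg=5.0, age=40), 1))
    print(round(score_person(fbg=8.0, age=65), 1))
```

Because the score is a linear function of the LR log-odds, the card can be printed as a table of points per bin, which is what makes it readable for users without a machine learning background. In this sketch WoE is oriented so that a higher score means higher disease risk; the opposite convention is also common.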
Keywords/Search Tags: Physical Examination Data, Machine Learning, Diabetes, Coronary Heart Disease, Risk Assessment, Score Card