In recent years, health care informatization in China has made great progress. Scientific research, health care services, and management practice have led to the accumulation of large volumes of electronic medical record (EMR) data in hospital information systems. For most hospitals, however, the EMR system remains mainly a management tool, and its medical data resources are not fully utilized. Data-driven mining methods that turn the available information into valuable knowledge for biomedical applications are needed more urgently than ever, and mining the large volumes of data held in EMR systems has therefore become a major trend in medical informatics research.

Diabetes has become the third major chronic disease threatening human health, after cardiovascular disease and malignant tumors. Diabetes mellitus readily causes complications, among which diabetic nephropathy is one of the most important. Complications are difficult to detect at an early stage and difficult to cure with drugs once they have developed, so the prediction of complications has become a research hotspot. This study uses data on diabetes mellitus complicated with nephropathy drawn from the electronic medical records of a hospital. The research on diabetic complications covers the following aspects:

(1) Data preprocessing. Because of improper operation and machine failure, the data contain quality problems such as human-introduced noise, missing values, and outliers. So that these problems do not affect the experimental prediction results, a series of preprocessing operations, including data extraction, integration, and cleaning, was carried out.

(2) Establishment of a predictive model for diabetic complications. A multidimensional analysis of the preprocessed data was carried out, and Random Forest (RF) was selected as the base algorithm for model construction after weighing the advantages and disadvantages of candidate algorithms. An empirical study on constructing a model of diabetic nephropathy was then conducted. Because the core parameters of the algorithm are difficult to determine, a random forest optimized by grid search (RFGS) and a random forest optimized by a genetic algorithm (RFGA) were designed to predict diabetic nephropathy. These were compared experimentally with the K-nearest neighbor algorithm, logistic regression, the support vector machine, and the non-optimized random forest. Analysis of the evaluation metrics shows that the models adopted in this paper predict better than the other models, which supports their validity and accuracy.

(3) System design and implementation. The research process and results of diabetes complication prediction were implemented as a system in Java, an object-oriented programming language. For system design, the overall functions of the system and the functions of each sub-module are explained, and the database design is given. For system implementation, the class diagram of each module is designed and explained, and the system interface is demonstrated on the prediction of diabetic nephropathy, which illustrates the practicability and reliability of the system.

Overall, the diabetes complication prediction system relies on the "digital medical joint laboratory" and represents an attempt to implement a medical assistant system and to verify its reliability and usability.
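
The grid search step behind RFGS in (2) can be illustrated with a minimal Java sketch. It is not the implementation described in this paper: the class name `GridSearchSketch`, the choice of the two tuned parameters (number of trees and number of features per split), and the toy scoring function are all assumptions made for illustration. In practice the scorer would train a random forest with the given parameters and return a cross-validated metric such as accuracy or AUC.

```java
import java.util.function.ToDoubleBiFunction;

// Hypothetical sketch of exhaustive grid search over two random forest
// hyperparameters. All names and parameter choices are assumptions.
final class GridSearchSketch {

    /** Best parameter pair found on the grid, with its score. */
    static final class Result {
        final int numTrees;
        final int maxFeatures;
        final double score;

        Result(int numTrees, int maxFeatures, double score) {
            this.numTrees = numTrees;
            this.maxFeatures = maxFeatures;
            this.score = score;
        }
    }

    /**
     * Evaluates every (numTrees, maxFeatures) combination and returns the
     * highest-scoring one. Ties keep the first combination encountered.
     */
    static Result search(int[] numTreesGrid, int[] maxFeaturesGrid,
                         ToDoubleBiFunction<Integer, Integer> scorer) {
        Result best = null;
        for (int trees : numTreesGrid) {
            for (int features : maxFeaturesGrid) {
                double score = scorer.applyAsDouble(trees, features);
                if (best == null || score > best.score) {
                    best = new Result(trees, features, score);
                }
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("parameter grid is empty");
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy scorer peaking at (200, 4); a stand-in for a cross-validated
        // metric computed by training a real model.
        ToDoubleBiFunction<Integer, Integer> toyScorer =
                (t, f) -> -Math.abs(t - 200) / 100.0 - Math.abs(f - 4);

        Result r = search(new int[] {50, 100, 200, 400},
                          new int[] {2, 4, 8}, toyScorer);
        System.out.println("best: " + r.numTrees + " trees, "
                + r.maxFeatures + " features per split");
    }
}
```

Grid search evaluates every combination, so its cost grows with the product of the grid sizes. A genetic algorithm such as the one behind RFGA instead samples and evolves candidate parameter sets, which can reduce the number of evaluations on large search spaces.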