With advances in medical treatment, patients with end-stage renal disease can be treated by kidney transplantation. However, post-operative complications such as infection occur easily and can endanger the survival of both the graft and the patient, so doctors schedule examinations at irregular, variable intervals to monitor each patient closely and adjust the dosing regimen of immunosuppressive drugs accordingly. As the number of examinations grows, electronic health record data accumulate day by day, and these data are often imbalanced and irregular. Because doctors must consult a patient's historical medical records to reach a comprehensive diagnosis, the sheer volume of data, the intricate relationships among data items, and individual differences between patients all make diagnosis more difficult. It is therefore of practical significance to design a kidney transplant infection prediction model that accurately predicts post-transplantation infection and assists in diagnosis and treatment. However, most current research on kidney transplant infection prediction comes from the medical field, and the resulting models have limitations: they are not intelligent, they are built on small-scale datasets, and their data do not follow the real-world distribution. As artificial intelligence and medicine become further integrated, machine learning methods have produced good results on medical datasets with similar characteristics, which makes machine learning an important research direction for this problem. The purpose of this paper is to design kidney transplant infection prediction models based on traditional machine learning and on deep learning, respectively, and to extend the research results on the kidney transplant dataset to medical
datasets with similar characteristics. The main work of this paper is as follows.

1. Analysis of the kidney transplant dataset and multi-scenario modeling. The kidney transplant dataset used in this paper was collected from a real hospital. It is imbalanced, irregular, and multivariate, contains missing values, requires single-point prediction, and has an imbalanced distribution of time-series lengths; no public dataset with identical characteristics currently exists. Depending on how the temporal information in the dataset is handled, the study is divided into a non-temporal scenario and a temporal scenario, and kidney transplant infection prediction models are built according to the characteristics of the dataset in each scenario.

2. A traditional machine learning-based kidney transplant infection prediction model for the non-temporal scenario and its generalization. Comparison experiments on methods for imbalanced datasets, feature selection methods, and traditional machine learning models are designed to address the imbalanced, multivariate, and missing-value characteristics of the data. The experimental results show that balancing the dataset improves the prediction performance of the models and that the effect of feature selection depends on the data distribution and data volume. The most stable and best-performing model is logistic regression without feature selection under the SMOTE method, with a recall of 75.76% and an F1 score of 9.98%. Finally, when the prediction framework based on traditional machine learning is extended to the Physionet2012 dataset, which has similar characteristics, the best classification performance is achieved by the GBDT model with feature selection under the EasyEnsemble method, with a recall of 66.71% and an F1 score of 33.19%.

3. A deep learning-based kidney transplant infection prediction model for the temporal scenario and its generalization. To address the imbalance, some
comparison experiments on methods for imbalanced datasets were designed; to address the multivariate data, missing values, irregular time intervals, and single-point prediction, the Multi-Time Attention Networks (MTAN) model with an optimized classification structure was proposed and evaluated in comparison experiments; and to address the uneven distribution of time-series lengths, an MTAN model based on the sliding window method was proposed and evaluated in comparison experiments. The experimental results show that the methods for imbalanced datasets, the optimized classification structure, and the sliding window method all improve the infection prediction ability of the MTAN model; the best temporal kidney transplant infection prediction model reaches a recall of 77.50% and an F1 score of 3.41%. Finally, when the deep learning-based prediction framework was extended to the Physionet2012 dataset, which has similar characteristics, the best classification performance was achieved by the MTAN model with the optimized classification structure, the sliding window method, and the weighted cross-entropy loss function, with a recall of 85.57% and an F1 score of 50.30%.
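
The abstract names the sliding window method and the weighted cross-entropy loss but does not state how windows are formed or how the class weight is chosen. The following is only a minimal sketch of the two ideas under stated assumptions: each patient is a list of (time, values) visits, a window holds a fixed number of consecutive visits, and the positive (infection) class is up-weighted by a user-supplied factor. The names `Visit`, `sliding_windows`, `weighted_bce`, `size`, `stride`, and `pos_weight` are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical record layout: (days since transplant, lab measurements).
Visit = Tuple[float, List[float]]


def sliding_windows(visits: Sequence[Visit], size: int, stride: int) -> List[List[Visit]]:
    """Cut one patient's visit sequence into windows of at most `size` visits.

    Short sequences are returned whole; long ones are split so that every
    window has the same number of visits (an assumption of this sketch).
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if len(visits) <= size:
        return [list(visits)]
    last_start = len(visits) - size
    windows = [list(visits[s:s + size]) for s in range(0, last_start + 1, stride)]
    # If the stride skipped the tail, add one final window ending at the last visit.
    if last_start % stride != 0:
        windows.append(list(visits[-size:]))
    return windows


def weighted_bce(probs: Sequence[float], labels: Sequence[int], pos_weight: float) -> float:
    """Mean binary cross-entropy with the positive class scaled by `pos_weight`."""
    if not labels or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and the same length")
    eps = 1e-12  # clamp probabilities to avoid log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)
```

With `pos_weight` greater than 1, a missed infection costs more than a false alarm, which is one common way to push a model toward higher recall on an imbalanced dataset; whether the paper sets its weights this way is not stated in the abstract.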