| In recent years, with the rapid development and wide application of computer technology, artificial intelligence and machine learning (ML) techniques have increasingly been used for the auxiliary diagnosis of disease. With the help of ML techniques, errors that pathologists and physicians make because of inexperience, fatigue, stress, and similar factors can be avoided, and clinical decisions can be made more efficiently and accurately. Motivated by these considerations, this thesis takes data-driven intelligent medical decision-making as its research background: it uses imbalanced medical datasets as input and constructs several intelligent classification models for disease diagnosis and decision support. The main findings and contributions are as follows. Firstly, this thesis proposed an improved cost-sensitive support vector machine (CSSVM) classification algorithm for static imbalanced datasets. The algorithm used the IG algorithm to select features, adopted the SAPSO intelligent optimization algorithm to optimize the classifier's internal parameters, taking into account the randomness of the kernel function parameter and penalty parameter of the SVM (RBF) classifier, and applied the resulting embedded CSSVM classification model to the intelligent diagnosis of disease. The experimental results verified that the improved cost-sensitive classification model outperformed other state-of-the-art classification models. Secondly, this thesis proposed an intelligent classification model for disease diagnosis based on static imbalanced datasets. The model used the synthetic minority oversampling technique (SMOTE) combined with a cross-validation committee filtering strategy to process the imbalanced input data, and built an ensemble SVM classification model that is composed of ten SVM classifiers with different structures and has strong generalization performance and classification
accuracy. A weighted fusion strategy was then applied to fuse the classifiers' outputs into the final result, with the fusion weights optimized by a simulated annealing genetic algorithm. The experimental results verified the effectiveness of the proposed ensemble learning classification model. Thirdly, this thesis proposed a hybrid intelligent classification model to predict physiological state based on dynamic imbalanced datasets. The model used a sliding time window to collect physiological data, applied the RCMSE method to extract features from the temporal physiological data, reconstructed the coarse-grained feature space, and introduced the SMOTE+Tomek links sampling strategy to process the imbalanced data. It then proposed the CR-MIFS method to select coarse-grained features, and finally proposed a novel hybrid model to predict the physiological state. The experimental results verified the effectiveness of the proposed hybrid intelligent classification model. Finally, this thesis proposed a two-stage learning model that classifies the five-year survival status in cancer prognosis and predicts the specific survival time of cancer patients based on dynamic imbalanced datasets. The model used a fixed time window to collect data in the advanced stage of cancer. In the first stage, a resampling strategy was applied to handle the imbalanced data, and a GBDT+LR classification model was used to classify the five-year survival status. In the second stage, a GBDT+LSTM prediction model was constructed to predict the specific survival time of the patients whom the first stage classified as negative (survival time of no more than 60 months). The experimental results verified that the first-stage classification model achieved higher classification accuracy in five-year survivability prediction than other state-of-the-art models, and that the second-stage survival time prediction model achieved the lowest RMSE and MAE. The
proposed two-stage learning model can perform well in clinical decision making. |