| Prostate cancer is one of the most prevalent cancers in the world today. The pathological Gleason classification system is currently the most widely used classification system for prostate cancer, and the Gleason classification is one of the most important reference factors for choosing treatment options and assessing patient prognosis. At present, the pathological Gleason score is obtained by biopsy. If a computer-assisted diagnosis method could predict a patient's Gleason classification from prostate magnetic resonance imaging (MRI) and classify the degree of pathological change as high or low risk, the need for biopsy could be reduced. This would greatly relieve patients' psychological and physiological burden and avoid complications such as infection and bleeding, thereby preventing induced worsening of cancerous lesions. This study is based on the radiomics research framework. After image acquisition, delineation of the region of interest (ROI), and data preprocessing, high-throughput features were extracted to map the tumor area information into a high-dimensional feature space. Feature selection and dimensionality reduction then removed irrelevant and redundant information, reducing the dimension of the feature space and increasing the information density, which effectively avoids the curse of dimensionality and can improve model accuracy. The reduced features were then fed into machine learning and deep learning models to predict the high or low risk classification of prostate cancer Gleason pathology, thereby assisting physicians non-invasively in selecting subsequent treatment options. The main work of this thesis covers the following five aspects: 1. The thesis first briefly explained the medical background of prostate cancer and MRI in order to motivate the research. It also summarized domestic and international radiomics-based methods for diagnosing prostate cancer and analyzed the existing
problems and deficiencies in order to establish a more suitable model. 2. Image features were extracted using radiomics. High-throughput feature extraction was performed on 316 prostate cancer MRI images and the regions of interest outlined by physicians, in order to build a predictive model for quantitative analysis. After discretizing the grayscale histogram of each image, a total of 92 texture features in 6 categories and 13 shape features describing geometry were extracted. These features make effective use of the information in the medical images to build a more effective prediction model. 3. The extracted features were selected and reduced to filter out irrelevant and redundant features and thereby avoid the curse of dimensionality and overfitting. The Spearman correlation coefficient was calculated between each of the 105 radiomics features and the high or low risk category of the pathology, and 29 features that were not significantly correlated were screened out. Before the features were fed into the artificial neural network for training, principal component analysis (PCA) was applied; while retaining 99% of the original feature information, it produced a new 21-dimensional feature space with increased information density. 4. A classification model based on machine learning was established. Three machine learning algorithms, namely support vector machine (SVM), random forest, and XGBoost, were used to classify and predict the Gleason pathological risk of prostate cancer. The models were validated with 10-fold cross-validation repeated over 100 data shuffles. The results show that XGBoost performs best on the T2WI sequence for classifying prostate cancer Gleason high and low risk, with an AUC of 0.719 and an accuracy of 0.722. The pathological classification prediction
performance of the DWI sequence with b-value = 2000 s/mm² is slightly better than that of the DWI sequences with b-values of 1000 or 3000 s/mm². 5. A classification model based on an artificial neural network and the Bagging ensemble learning algorithm was established. A three-layer feedforward neural network was built as the base learner, Bagging was used as the combination strategy, and the final classification of the ensemble's strong learner was determined by voting. Under 10-fold cross-validation with 100 data shuffles, the model performed best on the T2WI sequence, with an AUC of 0.759 and an accuracy (ACC) of 0.718. A box plot of the AUC values from the 1000 training runs demonstrates the necessity of the cross-validation strategy. In summary, four algorithm models based on the radiomics framework were used to predict the high and low risk classification of prostate cancer Gleason pathology. Among them, the classification model based on the artificial neural network and Bagging ensemble learning performed best, reaching AUC = 0.759 and ACC = 0.718 on the T2WI sequence, slightly higher than on the DWI sequences at the three b-values. The cross-validation strategy ensures the stability of the results. |
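
The evaluation protocol and the voting step described above can be illustrated with a minimal sketch. It is not the thesis's code: the function names are hypothetical, the splitting is assumed to be plain (unstratified) shuffled k-fold, and each of the 316 images is assumed to count as one sample. It shows only how 10 folds over 100 shuffles yield 1000 train/test splits, how a Bagging bootstrap sample is drawn, and how member predictions are combined by majority vote.

```python
import random
from collections import Counter


def repeated_kfold(n_samples, n_folds=10, n_repeats=100, seed=0):
    """Yield (train_idx, test_idx) pairs: n_folds splits for each of n_repeats shuffles."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)  # a fresh shuffle before each round of k-fold
        for fold in range(n_folds):
            test_idx = order[fold::n_folds]  # every n_folds-th shuffled sample
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield train_idx, test_idx


def bootstrap_sample(indices, rng):
    """Draw len(indices) training indices with replacement (the Bagging resampling step)."""
    return [rng.choice(indices) for _ in indices]


def majority_vote(member_predictions):
    """Combine per-member label lists into one label per sample by plurality vote."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*member_predictions)]


# Assumed sample count: one sample per MRI image mentioned in the abstract.
splits = list(repeated_kfold(n_samples=316))
print(len(splits))  # 10 folds x 100 shuffles

# Three hypothetical base learners voting on four samples (1 = high risk, 0 = low risk).
print(majority_vote([[1, 0, 1, 0], [1, 1, 0, 0], [0, 1, 1, 0]]))
```

In such a setup, each base learner would be trained on its own `bootstrap_sample` of a split's training indices, and one AUC value would be recorded per split, which is where the 1000 values summarized in the box plot would come from.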