Objective: To investigate whether radiomics combined with automatic machine learning-based classification of original multiphase DCE-MRI images can predict prostate cancer (PCa) aggressiveness. Methods: From January 2016 to May 2018, forty patients with biopsy-confirmed PCa were included from our hospital. Biopsy was performed within 4 weeks after the DCE-MRI examination. Clinical and imaging data were collected for each patient. According to the time-signal intensity curve, lesions were segmented on the first phase and on the strongest phase of tumor enhancement on the original DCE-MR images, and 1029 quantitative radiomics features were automatically calculated from each lesion. These features fall into four categories. Category 1 features (first-order statistics) describe the distribution of voxel intensities within the MR images through common basic metrics such as mean and entropy. Category 2 features (shape- and size-based features) reflect the shape and size of the regions. Category 3 features (texture features) are calculated from the gray-level run-length matrix (GLRLM), gray-level size zone matrix (GLSZM), and gray-level co-occurrence matrix (GLCM) and quantify differences in regional heterogeneity. Category 4 features (higher-order statistics) include intensity and texture features derived from transformations of the original image; the transformation filters were exponential, square, square root, logarithm, and wavelet. Three datasets were constructed: Dataset-F (features extracted from the first phase of tumor enhancement), Dataset-S (features extracted from the strongest phase of tumor enhancement), and Dataset-FS (features extracted from both phases). The variance threshold method, the select k-best method, and the least absolute shrinkage and selection operator (LASSO) algorithm were used to reduce the feature dimensions, and the optimal
feature subsets for each dataset were selected. Five machine learning approaches, namely logistic regression (LR), random forest (RF), decision tree (DT), K-nearest neighbor (KNN), and support vector machine (SVM), were trained with cross-validation, and the clinical value of each model was evaluated by the area under the curve (AUC). Correlation analysis was performed between the Gleason score (GS) of the PCa lesion and the features of the machine learning model that achieved the best classification performance. Results: The optimal subsets contained 8, 4, and 16 features in Dataset-F, -S, and -FS, respectively. The features of the optimal subset in Dataset-F were as follows: F-Least Axis-shape, F-Median-wavelet-HLH, F-Mean-wavelet-HLH, F-Large area emphasis (LAE) in GLSZM-square, F-Long Run Emphasis (LRE) in GLRLM-wavelet-HHH, F-Run Length Non Uniformity (RLN) in GLRLM-exponential, F-Total Energy-square root, and F-Total Energy-first order statistics. The features of the optimal subset in Dataset-S were as follows: S-Least Axis-shape, S-Large area high gray-level emphasis (LAHGLE) in GLSZM-texture features, S-Median-wavelet-HHL, and S-Mean-wavelet-HHL. The features of the optimal subset in Dataset-FS were as follows: F-Least Axis-shape, S-Least Axis-shape, F-Total Energy-first order statistics, F-Total Energy-logarithm, S-LAE in GLSZM-wavelet-LLH, S-LAHGLE in GLSZM-texture features, F-Zone entropy (ZE) in GLSZM-wavelet-HHL, F-LRE in GLRLM-wavelet-HHH, F-RLN in GLRLM-exponential, S-LRE in GLRLM-wavelet-HHH, F-Median-wavelet-HLH, F-LAE in GLSZM-square, S-Zone Variance (ZV) in GLSZM-texture features, and S-Mean-wavelet-HHL. The predictive efficacy of the machine learning models based on Dataset-F was as follows: LR (AUC = 0.87), RF (AUC = 0.83), DT (AUC = 0.71), KNN (AUC = 0.88), and SVM (AUC = 0.84). The predictive efficacy of the models based on Dataset-S was as follows: LR (AUC = 0.84), RF (AUC = 0.80), DT (AUC = 0.69), KNN (AUC = 0.82), and SVM (AUC = 0.83). The predictive efficacy of the models based on Dataset-FS was as
follows: LR (AUC = 0.93), RF (AUC = 0.82), DT (AUC = 0.77), KNN (AUC = 0.91), and SVM (AUC = 0.90). Across all three datasets, the LR model with Dataset-FS had the highest predictive efficacy (AUC = 0.93). Ten features in Dataset-FS (F-Least Axis-shape, S-Least Axis-shape, F-Total Energy-first order statistics, F-Total Energy-logarithm, S-LAE in GLSZM-wavelet-LLH, S-LAHGLE in GLSZM-texture features, F-ZE in GLSZM-wavelet-HHL, F-LRE in GLRLM-wavelet-HHH, F-RLN in GLRLM-exponential, and S-LRE in GLRLM-wavelet-HHH) showed a significant positive correlation with GS. Model performance on Dataset-F was generally better than on Dataset-S. Conclusion: Radiomics combined with machine learning-based analysis of the combined first and strongest phases of tumor enhancement on original DCE-MR images can predict PCa aggressiveness noninvasively, accurately, and automatically.
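
Because every model above is compared by AUC, a minimal sketch of how that metric can be computed may help in reading the reported values. The sketch assumes a binary outcome (for example, high-grade versus low-grade lesions) and uses the rank interpretation of the AUC: the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study data; the abstract does not state how aggressiveness was dichotomized.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the AUC as the fraction of (positive, negative) pairs
    in which the positive case has the higher score; ties count 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case correctly ranked above negative
            elif p == n:
                wins += 0.5   # tie: counted as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (assumption): 1 = high-grade lesion, 0 = low-grade lesion.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"Toy AUC: {toy_auc:.3f}")
```

In this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups, which is the scale on which the per-model values above are reported.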