Radiomics quantifies tumor regions of interest through high-throughput extraction of a large number of image features and uses machine learning to predict tumor grade. Radiomic features are computed from the region of interest (ROI), so segmentation accuracy has an important influence on the outcome of a radiomics study. At present, automatic and semi-automatic segmentation algorithms are not yet mature, and ROIs are mainly delineated manually by radiologists. The workload is heavy and the repeatability is low, which hinders the clinical application of radiomics. In addition, high-throughput extraction produces a large number of features, and models overfit easily when the training set is small. Because large numbers of high-quality medical samples are difficult to obtain, high-precision prediction models are difficult to train, and feature selection becomes necessary. ROI segmentation and feature selection are therefore two major problems in the implementation of radiomics.

Glioma is a common intracranial tumor with extremely high mortality and morbidity. Accurate grading of gliomas according to the World Health Organization (WHO) grading standards is very important for determining treatment protocols and for adjusting and evaluating the course of treatment. In this dissertation, glioma grading is used to verify the performance of the proposed stability feature selection method.

First, to address the dependence on ROI segmentation accuracy, a stability feature selection method that is insensitive to the tumor boundary is proposed. Two radiologists manually segmented the regions of interest, and radiomic features such as shape, density, texture, and wavelet features were computed from each segmentation. The correlation between the two groups of features was evaluated using mutual information, the Pearson correlation coefficient, and other metrics, and boundary-insensitive features were selected by thresholding. The experimental results show that the selected features retain the predictive information while reducing the model's requirements on boundary segmentation accuracy, which improves the adaptability of the model.

The candidate feature set obtained from this screening is still high-dimensional. Second, this dissertation proposes a de-redundancy algorithm that combines mutual information, the Pearson correlation coefficient, and K-means clustering; it further reduces the feature dimension by removing redundant features and thereby improves feature quality. The method finally selected eleven features that are insensitive to the tumor boundary and have better predictive performance (R > 0.8). Logistic regression, random forest, k-nearest neighbors (KNN), and support vector machine (SVM) models trained on the selected features all achieve training accuracy above 80%, which indicates that the features selected by the algorithm have good independence and discriminative power.
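
The boundary-insensitive screening described above can be illustrated with a minimal sketch. It assumes that each radiologist's segmentation yields one value per feature per patient, and it uses only the Pearson correlation coefficient with a threshold of 0.8; the mutual information criterion is omitted. The pruning step is a simple greedy correlation filter, used here as a simplified stand-in for the clustering-based de-redundancy algorithm, not a reproduction of it. All function names, feature names, and numbers are hypothetical toy values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant feature gives no evidence of agreement
    return cov / (norm_x * norm_y)


def select_stable_features(reader_a, reader_b, threshold=0.8):
    """Keep features whose values agree between two manual segmentations.

    reader_a, reader_b: dict mapping feature name -> per-patient values,
    computed from the first and second radiologist's ROI respectively.
    """
    return [
        name
        for name in reader_a
        if name in reader_b
        and pearson(reader_a[name], reader_b[name]) > threshold
    ]


def drop_redundant(features, candidates, threshold=0.9):
    """Greedy pruning (assumed stand-in for the de-redundancy step):
    keep a feature only if its absolute correlation with every feature
    already kept does not exceed the threshold."""
    kept = []
    for name in candidates:
        if all(abs(pearson(features[name], features[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: three features for five patients, two readers.
reader_a = {
    "volume": [10.0, 12.0, 15.0, 20.0, 26.0],
    "mean_intensity": [100.0, 110.0, 105.0, 120.0, 130.0],
    "edge_contrast": [0.2, 0.9, 0.4, 0.7, 0.1],
}
reader_b = {
    "volume": [10.5, 11.8, 15.4, 19.5, 26.3],
    "mean_intensity": [101.0, 109.0, 106.0, 121.0, 129.0],
    "edge_contrast": [0.8, 0.1, 0.5, 0.3, 0.6],
}

stable = select_stable_features(reader_a, reader_b)
compact = drop_redundant(reader_a, stable)
print("boundary-insensitive:", stable)
print("after pruning:", compact)
```

In this toy data, the hypothetical `edge_contrast` feature disagrees between the two readers and is rejected by the stability filter, while `mean_intensity` is pruned afterwards because it is strongly correlated with `volume`. A real pipeline would compute the features with a radiomics toolkit and would add the mutual information criterion and the clustering step.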