| Medical diagnosis currently faces several critical problems, such as insufficient medical resources, a growing number of patients, and an inefficient diagnostic process. Computer-aided breast cancer image recognition models can alleviate these problems to some extent. However, existing studies usually rely on a single image feature and neglect the cross-modal pathological semantics, the complementarity among features in the final decision, and the ensemble nature of different predictions. To address these problems, a novel breast cancer image recognition model based on multi-stage multi-feature deep fusion is proposed, which organically integrates three multi-feature fusion methods (early-, mid-, and late-fusion). The main work is as follows: (1) A breast cancer image recognition model based on traditional classifiers: Seven image features, namely SIFT, HOG, Gist, LBP, VGG, ResNet, and DenseNet, are extracted to characterize mammograms from diverse perspectives, including shape, texture, and deep learning. The probability score of each image feature is computed by SVM, AdaBoost, and other classifiers, and the diagnosis is made according to these probability scores. Experimental results demonstrate that the recognition model using the AdaBoost algorithm is superior to the models using SVM, XGBoost, or other classifiers. It obtains better Acc and AUC values on the CBIS-DDSM dataset, but over-fitting occurs on the INbreast dataset. When each image feature is visualized, obvious overlaps between the negative and positive samples can be observed, which means that a single image feature can hardly distinguish different kinds of samples. Therefore, the recognition performance of a single image feature needs to be improved, and the complementarity among different image features should be exploited to improve the final recognition performance. (2) A breast cancer image recognition model based on a 
modified ERGS algorithm: First, on the basis of the above seven image features, the prior probability of each feature is computed by SVM, AdaBoost, or other algorithms. Second, the traditional ERGS algorithm is modified to perform feature mid-fusion by dynamically calculating the ERGS weight of each image feature, and the final predictions are made by weighting the prior probabilities with the ERGS weights. Experimental results demonstrate that the modified ERGS algorithm exploits the complementarity among different image features, which effectively suppresses the noise caused by variations in the shape, angle, and illumination of breast masses in lesion areas, so the final recognition performance is improved. However, for an imbalanced dataset such as INbreast, the modified ERGS algorithm cannot alleviate the over-fitting problem, and the overall recognition performance needs further improvement. Hence, the cross-modal pathological semantics among different image features and the ensemble nature of different predictions should be fully utilized to address this problem. (3) A novel breast cancer image recognition model based on multi-stage multi-feature deep fusion: First, on the basis of the seven image features, the cross-modal pathological semantics among different image features are deeply mined, and feature early-fusion is then implemented. Second, the prior probability of each image feature (or cross-modal pathological semantics) is computed by the AdaBoost algorithm. Third, a modified ERGS* (Efficient Range-based Gene Selection) algorithm is proposed to dynamically calculate the weight of each image feature (or cross-modal pathological semantics) and complete feature mid-fusion. Finally, a group of feature combinations is selected for voting-based ensemble learning, which completes feature late-fusion. Experimental results demonstrate that the over-fitting problem is suppressed to a certain degree by utilizing the cross-modal 
pathological semantics. In the visualization, no obvious overlaps between the negative and positive samples can be observed, so the cross-modal pathological semantics distinguish different kinds of samples more easily. Moreover, the modified ERGS* algorithm and the voting-based ensemble learning strategy further improve recognition performance. In addition, the proposed model has a very low false positive rate (FPR), which indicates higher practical value. |
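
The mid-fusion and late-fusion stages described above can be illustrated with a minimal Python sketch. It is not the modified ERGS* algorithm itself, whose details the abstract does not give. It only shows the general idea of a range-overlap weighting: features whose per-class score ranges overlap less receive larger weights. This is followed by weighted score fusion and majority voting. The function names, the width factor `GAMMA`, the 0.5 decision threshold, and all numbers are assumptions made for illustration.

```python
from statistics import mean, pstdev

GAMMA = 1.732  # assumed width factor for the effective range (illustrative)


def effective_range(scores, prior):
    """Return (low, high) = mean -/+ (1 - prior) * GAMMA * std for one class."""
    mu, sd = mean(scores), pstdev(scores)
    half = (1.0 - prior) * GAMMA * sd
    return mu - half, mu + half


def overlap_weights(pos, neg):
    """Weight each feature by how little its class ranges overlap.

    pos / neg map a feature name to the classifier probabilities obtained
    on positive / negative training samples. Returns weights summing to 1.
    """
    n_pos = len(next(iter(pos.values())))
    n_neg = len(next(iter(neg.values())))
    p_pos = n_pos / (n_pos + n_neg)  # class prior of the positive class
    coeff = {}
    for name in pos:
        lo_p, hi_p = effective_range(pos[name], p_pos)
        lo_n, hi_n = effective_range(neg[name], 1.0 - p_pos)
        overlap = max(0.0, min(hi_p, hi_n) - max(lo_p, lo_n))
        span = max(hi_p, hi_n) - min(lo_p, lo_n)
        coeff[name] = overlap / span if span > 0 else 1.0
    top = max(coeff.values())
    # Normalise by the largest overlap, then invert: less overlap, more weight.
    raw = {k: 1.0 - (v / top if top > 0 else 0.0) for k, v in coeff.items()}
    total = sum(raw.values())
    if total == 0:  # all features overlap equally: fall back to uniform weights
        return {k: 1.0 / len(raw) for k in raw}
    return {k: v / total for k, v in raw.items()}


def mid_fusion(probs, weights):
    """Weighted sum of per-feature positive-class probabilities for one image."""
    return sum(weights[k] * probs[k] for k in weights)


def late_fusion(fused_scores, threshold=0.5):
    """Majority vote over the fused scores of several feature combinations."""
    votes = [1 if s >= threshold else 0 for s in fused_scores]
    return 1 if sum(votes) * 2 > len(votes) else 0


# Toy, made-up probabilities for three of the seven features.
pos = {"HOG": [0.9, 0.8, 0.85], "LBP": [0.6, 0.4, 0.7], "VGG": [0.8, 0.5, 0.7]}
neg = {"HOG": [0.1, 0.2, 0.15], "LBP": [0.3, 0.6, 0.45], "VGG": [0.3, 0.55, 0.2]}
weights = overlap_weights(pos, neg)
score = mid_fusion({"HOG": 0.7, "LBP": 0.4, "VGG": 0.6}, weights)
label = late_fusion([score, 0.55, 0.3])
```

In this sketch the feature with the largest class overlap receives zero weight, which mirrors the suppression of weakly discriminative features, and the voting step needs a strict majority to predict the positive class. A real pipeline would obtain the probabilities from trained classifiers such as AdaBoost rather than from hand-written lists.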