Objective: Breast cancer is a major disease that seriously threatens human survival and social development, and a serious public health problem that places a heavy burden on society. Early screening for breast cancer has therefore become a focus of health strategy for governments worldwide. Mammography is one of the most commonly used screening methods because it is simple, effective and economical. Early screening generates a large number of mammography images, so classifying them rapidly and accurately is very important. Deep learning has great advantages and potential in visual processing, and how to use it to help clinicians diagnose quickly and accurately is a research hotspot. Researchers have proposed many innovative network structures and models for this purpose, but most studies evaluate only overall model performance and ignore the reliability of the judgment made on a single image. In clinical diagnosis and treatment, however, a single patient yields only a few breast images. Existing models assign a category to a breast image without indicating how reliable that judgment is, which limits their clinical application. This study therefore uses Grad-CAM and Mask R-CNN to analyze the reliability of benign versus malignant classification results for single mammography images, in order to promote the application of deep learning in medicine.

Methods: Mammography images and baseline data were downloaded from the DDSM database. The images were converted in format, screened according to inclusion and exclusion criteria, and divided into a benign group, a malignant group and a normal group. Mammography images with good image quality and a clearly demarcated mass region that met the requirements were then selected. Two
experienced imaging physicians labeled the mass regions on the mammography images, and mask images of the masses were output. First, the baseline data of the three groups were analyzed statistically to compare age and breast tissue density among the groups. Second, a convolutional neural network was used to build a benign versus malignant classification model on the original mammography images. Third, an image cropping algorithm was used to reduce the background area while retaining most of the breast, and a convolutional neural network was used to build a benign versus malignant classification model on the manually cropped images. Finally, a Mask R-CNN semantic segmentation model of the breast was built using the normal group; after this model was used to remove the background noise from the original images of the benign and malignant groups, a benign versus malignant classification model was built on the segmented images. In addition, a Mask R-CNN semantic segmentation model of breast masses was built using the benign and malignant groups to label the mass region automatically. The Grad-CAM algorithm was used to output gradient-weighted class activation heat maps for all three models. For the first two models, classification reliability was calculated using the physician-labeled mass region; for the third model, it was calculated using the mass region labeled by the semantic segmentation model. The performance and classification reliability of the three models were compared.

Results: A total of 5358 mammography images meeting the requirements were collected, including 1161 in the benign group, 1421 in the malignant group and 2776 in the normal group. The results are as follows: (1) Baseline data analysis showed that the mean age and the grade distribution of
breast tissue density in the malignant group differed significantly from those in the other two groups (P < 0.05); the malignant group had a higher age at onset and a higher proportion of high-density breast tissue. (2) The performance of the three models was as follows. The classification model on the original mammography images had an AUC of 0.88 (95% CI = 0.85-0.91), an accuracy of 0.76, a precision of 0.77, a recall of 0.82 and an F1 score of 0.79. The classification model on the manually cropped images had an AUC of 0.70 (95% CI = 0.66-0.74), an accuracy of 0.64, a precision of 0.69, a recall of 0.63 and an F1 score of 0.66. The classification model on the semantically segmented images had an AUC of 0.83 (95% CI = 0.80-0.86), an accuracy of 0.75, a precision of 0.78, a recall of 0.75 and an F1 score of 0.76. (3) The classification reliability on the test set was as follows. For the model on the original images, the classification reliability values were all below 0.10, with most concentrated around 0, and the pixel accuracy distribution did not differ significantly among the four groups (P > 0.05). For the model on the manually cropped images, the classification reliability of correctly classified images was greater than 0.5, which differed significantly from that of misclassified images (P < 0.05). For the model on the semantically segmented images, the classification reliability of correctly classified images was above 0.4, which differed significantly from that of misclassified images (P < 0.05); it was also significantly higher than that of the original-image classification model (P < 0.05), and no significant difference was found between the
classification model on the manually cropped images and the model on the semantically segmented images (P > 0.05).

Conclusions: (1) Age and breast density in patients with breast cancer differed significantly from those in patients with benign masses and in the normal population. (2) Two methods for analyzing the reliability of benign versus malignant classification of mammography images were established. (3) The classification model built after semantic segmentation performed better than the model built after manual cropping and similarly to the model on the original images, while its classification reliability was much higher than that of the original-image model, giving it the best overall performance. (4) Semantic segmentation models of the breast and of breast masses were established.
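
The Methods describe computing a classification reliability from a Grad-CAM heat map and a labeled mass region, but the abstract does not give the formula. The sketch below shows one plausible reading only: the share of strongly activated heat-map pixels that fall inside the mass mask. The function name `classification_reliability`, the min-max normalization and the 0.5 activation threshold are all assumptions made for illustration, not the study's actual definition.

```python
import numpy as np


def classification_reliability(heatmap, mass_mask, threshold=0.5):
    """Hypothetical reliability score (an assumption, not the study's formula).

    Returns the fraction of strongly activated Grad-CAM pixels that lie
    inside the labeled mass region. `heatmap` and `mass_mask` are 2-D
    arrays of the same shape; `threshold` is an assumed cut-off applied
    after min-max normalization of the heat map.
    """
    heatmap = np.asarray(heatmap, dtype=float)
    mass_mask = np.asarray(mass_mask).astype(bool)
    if heatmap.shape != mass_mask.shape:
        raise ValueError("heatmap and mass_mask must have the same shape")

    # Min-max normalize the heat map to [0, 1]; a flat map carries no signal.
    span = heatmap.max() - heatmap.min()
    if span == 0:
        return 0.0
    normalized = (heatmap - heatmap.min()) / span

    # Pixels the classifier attended to most strongly.
    activated = normalized >= threshold
    n_activated = int(activated.sum())
    if n_activated == 0:
        return 0.0

    # Share of attended pixels that overlap the mass region.
    return float((activated & mass_mask).sum() / n_activated)


# Toy example: attention concentrated on the top-left 2x2 block.
heat = np.zeros((4, 4))
heat[:2, :2] = 1.0
mask_on_mass = np.zeros((4, 4), dtype=bool)
mask_on_mass[:2, :2] = True
print(classification_reliability(heat, mask_on_mass))  # 1.0
```

Under this reading, a score near 0 means the classifier attended almost entirely to regions outside the mass, while a score near 1 means its attention coincided with the labeled mass. The score is normalized by the activated area rather than the whole image so that large background regions do not inflate it.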