Mammography is widely used in breast cancer diagnosis. Because reading and interpreting mammograms requires profound medical knowledge and rich diagnostic experience, the diagnosis depends heavily on the skill of the individual doctor. Computer-aided automatic analysis of mammograms and diagnosis of breast cancer not only provides doctors with objective diagnostic suggestions but also saves valuable medical resources and improves diagnostic performance. In recent decades, researchers have proposed many machine learning algorithms for mammography-based breast cancer diagnosis. These algorithms mainly remove noisy areas such as the pectoral muscle from the mammogram and design breast mass feature extractors and classifiers to obtain better diagnostic performance.

These algorithms have several problems: 1) Breast masses are difficult to detect and classify because mammograms are blurry and the masses overlap with other tissue. 2) The latent features of mammograms are not fully mined because existing algorithms do not make full use of the multi-view nature of mammography. 3) The large number of unlabeled mammograms has not been used effectively.

In this paper, we study diagnostic techniques for breast cancer on mammograms using image processing and machine learning. The research contents and innovations of this paper are summarized as follows:

1) Breast mass detection algorithms are difficult to design because the edges and textures of breast masses are blurry and the backgrounds of mammograms are complex. To improve detection performance, we propose a breast mass detection algorithm based on mathematical morphology and image template matching. The algorithm transforms each suspected mass area on the mammogram into a circular region using mathematical morphology, locates the circular region by template matching with a circular image template, and then uses a classifier based on a convolutional neural network (CNN) to confirm whether each suspected area contains a breast mass. On DDSM, the proposed algorithm achieves a true positive rate (TPR) of 96% for mass detection with 0.53 false positives per image (FPI), which is better than all of the comparison algorithms. Finally, the bounding box of the breast mass is refined by particle swarm optimization, with a fitness function based on the CNN classifier of suspected mass areas.

2) Breast masses are difficult to detect and classify because their features on mammograms are not distinctive. To strengthen feature extraction, we propose a correlation coefficient attention mechanism module. The module automatically learns a weight for each feature map output by a convolutional layer according to that feature map's contribution to the breast mass detection and classification task, and then applies the weights so that feature maps with large contributions are enhanced and those with small contributions are suppressed. Specifically, the module first builds a correlation coefficient matrix by computing the correlation coefficients among all feature maps output by the convolutional layer, then computes the feature map weights from this matrix with two fully connected layers, and finally applies the weights to all of the feature maps. Inserting the proposed module into the backbone network of both two-stage and one-stage object detection frameworks improves their breast mass detection and classification ability.

3) To exploit the latent features of mammograms and improve classification performance, we propose a multi-view convolutional neural network that extracts complementary features from the mediolateral oblique (MLO) and craniocaudal (CC) views. In addition, the proposed method improves the ability to distinguish benign from malignant breast masses by enhancing the weights of the feature maps that contribute most to the classification task. We also add a penalty term based on the Fuzzy C-means algorithm to the cross-entropy loss function, which enhances the generalization ability of the model by maximizing the inter-class distance and minimizing the intra-class distance. On DDSM, the accuracy, sensitivity, specificity, F1 score and AUC of the proposed algorithm reach 78.39%, 82.69%, 74.07%, 78.89% and 0.8347, respectively, which are better than all of the comparison algorithms.

4) To use unlabeled mammograms for training the breast mass classification model, we first train the model on the labeled breast mass images, then use it to predict the unlabeled breast mass images and assign the predictions to the unlabeled samples as pseudo-labels. To avoid training on low-confidence pseudo-labeled samples, we use self-paced learning and the K-means algorithm to gradually add high-confidence pseudo-labeled samples to the training dataset. In addition, the consistency between the predictions for the MLO view and the CC view is added to the loss function as a penalty term, which improves the model's ability to extract discriminative features from breast mass images. We also extract fine-grained features from the multi-view breast mass images with compact bilinear pooling to improve the classification performance of the model.
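The data flow of the correlation coefficient attention module in contribution 2) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not say how the correlation matrix is fed to the two fully connected layers, so the sketch assumes each row of the matrix passes through the same two layers (ReLU, then sigmoid) to give one weight per feature map. All function names and parameters (`w1`, `b1`, `w2`, `b2`) are hypothetical, and the weights are fixed toy values rather than learned ones.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    # Assumption: a constant map (zero variance) is given correlation 0.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def correlation_matrix(feature_maps):
    """Build the C x C correlation matrix from C flattened feature maps."""
    return [[pearson(a, b) for b in feature_maps] for a in feature_maps]

def channel_weights(corr, w1, b1, w2, b2):
    """Map each row of the correlation matrix to one weight in (0, 1).

    Hypothetical parameters: w1 is H x C, b1 has length H,
    w2 has length H, b2 is a scalar.
    """
    weights = []
    for row in corr:
        # First fully connected layer followed by ReLU.
        hidden = [max(0.0, sum(w * r for w, r in zip(w_row, row)) + b)
                  for w_row, b in zip(w1, b1)]
        # Second fully connected layer followed by a sigmoid.
        logit = sum(w * h for w, h in zip(w2, hidden)) + b2
        weights.append(1.0 / (1.0 + math.exp(-logit)))
    return weights

def apply_attention(feature_maps, weights):
    """Scale every value of each feature map by that map's weight."""
    return [[w * v for v in fmap] for fmap, w in zip(feature_maps, weights)]

# Toy example: three flattened 2x2 feature maps.
maps = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
corr = correlation_matrix(maps)
w1 = [[0.5, 0.5, -0.5], [-0.5, 0.5, 0.5]]  # toy values, H = 2, C = 3
b1 = [0.0, 0.0]
w2 = [1.0, -1.0]
b2 = 0.0
weights = channel_weights(corr, w1, b1, w2, b2)
reweighted = apply_attention(maps, weights)
```

In a real detector the fully connected parameters would be trained end to end with the backbone, and the operation would run on batched tensors rather than Python lists.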
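The gradual admission of pseudo-labeled samples in contribution 4) can be sketched as a confidence threshold that relaxes over training rounds. This is a simplified stand-in. It covers only the self-paced part and omits the K-means step, whose exact role the abstract does not specify. The function names, the linear schedule, and the threshold values are assumptions made for illustration.

```python
def select_pseudo_labels(probs, threshold):
    """Pick unlabeled samples whose prediction confidence meets the threshold.

    probs: predicted probability of the malignant class for each sample.
    Returns a list of (sample_index, pseudo_label) pairs.
    """
    selected = []
    for i, p in enumerate(probs):
        confidence = max(p, 1.0 - p)  # confidence of the predicted class
        if confidence >= threshold:
            selected.append((i, 1 if p >= 0.5 else 0))
    return selected

def self_paced_schedule(start=0.95, end=0.70, rounds=4):
    """Linearly relax the threshold from `start` to `end` (assumed schedule)."""
    if rounds == 1:
        return [start]
    step = (start - end) / (rounds - 1)
    return [start - k * step for k in range(rounds)]

# Toy predictions for five unlabeled mass images.
probs = [0.98, 0.03, 0.60, 0.85, 0.20]
for threshold in self_paced_schedule():
    chosen = select_pseudo_labels(probs, threshold)
    # A full pipeline would retrain the classifier on labeled data plus
    # `chosen` here and then re-predict `probs` before the next round.
    print(round(threshold, 3), chosen)
```

Starting strict and relaxing later means early rounds train only on the pseudo-labels the model is most sure of, which limits the effect of wrong labels while the model is still weak.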
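The view-consistency penalty in contribution 4) can be illustrated as an extra loss term that grows when the MLO and CC predictions for the same mass disagree. The abstract does not give the form of the penalty. The squared difference of the two predicted probabilities and the weighting factor `lam` used here are assumptions, and the function names are hypothetical.

```python
import math

def binary_cross_entropy(p, y, eps=1e-12):
    """Cross-entropy of predicted probability p against label y (0 or 1)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def multi_view_loss(p_mlo, p_cc, y, lam=0.5):
    """Average cross-entropy of both views plus a consistency penalty.

    p_mlo, p_cc: predicted malignancy probabilities from the two views.
    lam: assumed weight of the consistency penalty.
    """
    ce = 0.5 * (binary_cross_entropy(p_mlo, y) + binary_cross_entropy(p_cc, y))
    penalty = (p_mlo - p_cc) ** 2  # zero when the two views agree
    return ce + lam * penalty

# The views agree in the first call and disagree in the second.
agree = multi_view_loss(0.9, 0.9, y=1)
disagree = multi_view_loss(0.9, 0.5, y=1)
print(agree < disagree)
```

For a pseudo-labeled or unlabeled sample the cross-entropy part may be unreliable or unavailable, while the penalty term needs no label at all, which is one reason a consistency term suits semi-supervised training.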