Breast Mass Detection From The Digitized Mammograms Based On Machine Learning

Posted on: 2020-11-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R B Shen
Full Text: PDF
GTID: 1364330629982977
Subject: Computer Science and Technology
Abstract/Summary:
Breast cancer is one of the leading causes of cancer death among women worldwide. Detecting it early, standardizing its diagnosis, and reducing its death rate are major challenges for the medical community all over the world. Mammography is the most widely used method for early breast cancer detection and is widely accepted as the most effective and reliable examination for finding early signs of breast cancer, such as masses, microcalcifications, and distortion of the glandular structure. Early detection through mammography can significantly reduce mortality among screened people: in developed countries where mammography is widespread, the breast cancer survival rate can exceed 80%, while in developing countries with relatively poor healthcare it is below 40%. Because there is currently no effective way to prevent breast cancer, early detection is an important way to fight it.

In recent years, with the rapid development of computer technology, digital computer-aided diagnosis systems have become a trend in medical technology, and therefore also in breast cancer diagnosis. Digital mammography offers clearer image resolution, and its images are easier to save, retrieve, and transmit. Additionally, a computer can analyze and identify diagnostic information that the human eye cannot perceive. A large number of mammogram-based computer-aided detection techniques have been proposed in recent years, and many of them have been partly applied in clinical diagnosis. According to recent studies, however, traditional algorithms still have many deficiencies: some are limited to certain types of lesions, some have a high false positive rate, and some require considerable human intervention. There is therefore an urgent need for automatic, more accurate, and more versatile computer-aided detection systems based on mammography.

Automatically and quickly detecting abnormal mass lesions in digitized mammograms is a challenging task. A mammogram is a grayscale medical image. Designing and implementing a computer-aided mass detection system raises the following problems.

1. Masses present various sizes and shapes in mammograms and are usually embedded in and surrounded by normal tissues of similar density. Traditional detection techniques that rely on manually designed features have difficulty achieving good detection performance.
2. Modeling a mass detection system requires a large number of well-annotated mammograms, and such medical image data is very costly to obtain. Reducing the need for annotated data is therefore an urgent problem.
3. Because of differences in acquisition equipment and parameter configuration, mammographic datasets from different sources differ considerably from each other. To avoid the effort of annotating the target datasets, the features of lesion regions must be learned from precisely annotated datasets and transferred to unannotated datasets.
4. Because mass lesions occupy a relatively small region of a mammogram, the identification process encounters a class imbalance problem.

To address these problems, this dissertation carries out the following research.

First, to extract better feature representations for the mass detection task, the outline of a mass lesion and the background information around it are used to enrich the feature representations. On this basis, a computer-aided diagnosis (CAD) system for mass detection based on multi-context multi-task learning (MCMTL) networks is designed. The CAD system comprises a suspicious region localization (SRL) module and an MCMTL networks module. The SRL module employs region proposal methods to find suspicious regions (regions of interest, ROIs) in a mammogram. Patches are extracted from the suspicious regions and classified as positive or negative by the MCMTL networks, which integrate features from multi-size patches of the suspicious regions to perform classification and segmentation simultaneously. Specifically, the MCMTL networks incorporate multi-context learning (MCL) and multi-task learning (MTL), which are jointly optimized in an end-to-end manner. In evaluations on public datasets, the proposed MCMTL networks significantly outperform their counterparts, and the CAD system based on them achieves 0.812 TPR@2.53 FPI and 0.919 TPR@0.12 FPI, respectively, which also outperforms state-of-the-art methods.

Second, to address the lack of mammographic annotations, a mass detection method that combines deep active learning (DAL) and self-paced learning (SPL) is proposed. The method selects a small number of the most informative samples from a large pool of unannotated samples for annotation, which dramatically reduces the number of annotated samples required. It also provides a more effective training strategy. In DAL, an informativeness query algorithm ranks the unannotated samples by considering their uncertainty and diversity. Furthermore, a self-paced sampling strategy selects a few of the most informative samples by considering complexity and pace. In experiments on a self-collected dataset, the proposed method achieves 0.9220 PAUC and 0.9643 TPR@2.0 FPI, outperforming both the active learning method and the random sampling method, while annotating only about 20% of the samples in the training dataset.

Third, to address the lack of mammographic annotations and to minimize the cost of annotation, an unsupervised domain adaptation method with adversarial learning is proposed for mass detection. It learns lesion features from a well-annotated dataset and transfers them to an unannotated dataset. The method employs a task-specific fully convolutional network (FCN) for spatial density prediction. In addition, a domain discriminator is designed, in which adversarial learning aligns the features of the less-annotated target domain with those of the well-annotated source domain in feature space. To further improve model performance, a novel training strategy for adversarial learning is proposed that overcomes the problems of overly small batch sizes and oscillation. In experiments, the proposed method achieves 0.9083 PAUC and 0.9479 TPR@2.0 FPI, outperforming state-of-the-art methods, and the proposed training strategy is shown to converge much faster. The experimental results also show that the unsupervised domain adaptation method, using more unannotated samples, achieves performance comparable to that of a supervised method that uses fewer annotated samples.

Fourth, to address the class imbalance problem in medical image analysis, a GAN-based method is proposed that performs synthetic sampling in feature space, i.e., feature augmentation, which counteracts the negative influence of class imbalance on the training of a classification model. A feature extraction network is first trained to map images into feature space. The GAN framework then uses adversarial learning to train a feature generator that plays a minimax game with a discriminator. The trained feature generator produces features from arbitrary latent distributions to simulate real image features. Finally, a data cleaning technique removes undesirable conflicting features, yielding a well-defined, class-balanced feature set that improves class imbalance learning. In experiments on two medical image analysis tasks, the proposed method achieves performance superior or comparable to that of state-of-the-art counterparts.
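
Several of the results above are reported as TPR@FPI, that is, the true positive rate (the fraction of annotated masses that are detected) at a given average number of false positives per image. As a minimal sketch of how such an operating point can be computed, the Python function below sweeps a confidence threshold over scored detections. The function name, its arguments, and the toy data are illustrative assumptions and are not taken from the dissertation. The sketch also assumes that each detection has already been matched to at most one annotated mass and that detection scores are distinct.

```python
from typing import Sequence


def tpr_at_fpi(
    scores: Sequence[float],
    is_true_positive: Sequence[bool],
    num_images: int,
    num_lesions: int,
    max_fpi: float,
) -> float:
    """Highest true positive rate reachable with at most max_fpi false positives per image.

    Hypothetical helper for illustration only. Each entry of is_true_positive
    says whether the detection with the corresponding score hits a distinct
    annotated mass (matching is assumed to be done beforehand).
    """
    # Lowering the threshold admits detections in order of decreasing score.
    ranked = sorted(zip(scores, is_true_positive), key=lambda p: p[0], reverse=True)

    true_positives = 0
    false_positives = 0
    best_tpr = 0.0
    for _, hit in ranked:
        if hit:
            true_positives += 1
        else:
            false_positives += 1
        # Stop once the false positive budget per image is exceeded.
        if false_positives / num_images > max_fpi:
            break
        best_tpr = max(best_tpr, true_positives / num_lesions)
    return best_tpr


# Toy data: 5 detections over 2 images that contain 4 annotated masses in total.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_hits = [True, False, True, False, True]
print(tpr_at_fpi(toy_scores, toy_hits, num_images=2, num_lesions=4, max_fpi=0.5))
```

On the toy data, a budget of 0.5 false positives per image admits the three highest-scoring detections, two of which are hits, so the function returns 0.5 (2 of 4 masses found).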
Keywords/Search Tags: Machine Learning, Mammogram Analysis, Mass Detection, Multi-context Multi-task Learning, Active Learning and Self-paced Learning, Unsupervised Domain Adaptation, Class Imbalance Learning