
Deep Learning Fusion Algorithm For The Classification Of Benign And Malignant Pulmonary Nodules

Posted on: 2021-04-25
Degree: Master
Type: Thesis
Country: China
Candidate: N Tang
Full Text: PDF
GTID: 2404330611495871
Subject: Epidemiology and Health Statistics

Abstract/Summary:
Background and significance: Lung cancer is the leading cause of cancer-related death worldwide. Because it grows asymptomatically, patients are often diagnosed at an advanced stage. Epidemiological investigation shows that the 5-year survival rate of early-stage lung cancer patients can reach 56%, while that of late-stage patients is only 5%. Early screening is therefore the key means of prolonging the survival of lung cancer patients, and related research shows that low-dose CT screening can reduce mortality among high-risk individuals by 14-20%. Because radiologists are relatively scarce, the large volume of clinical imaging data has increased their workload, and long-term, high-intensity reading easily leads to inaccurate and inconsistent interpretation. Computer-aided diagnosis systems have therefore been developed in the hope of reducing radiologists' burden and improving the efficiency of clinical diagnosis. Computer-aided diagnosis has developed through two main stages: the first was based on machine learning methods that take handcrafted image features as input (also known as radiomics); the second is based on deep learning methods that take images as input directly.

Existing problems and research purposes: The rise of deep learning in recent years has brought new opportunities to the field of computer-aided diagnosis (CAD), but its application to medical imaging is a young discipline still in an early stage of development, and several problems remain to be solved. On the one hand, many current algorithms analyze only a patient's unstructured data, such as CT, MRI, X-ray images, or pathological slides, while ignoring structured data such as clinical baseline data, disease history, family history, and laboratory examinations, which are also an important basis for an accurate judgment of a patient's condition; how to integrate a patient's multimodal data for diagnosis is therefore a challenge. On the other hand, because deep learning can automatically extract image features and learn from them, attention has focused on improving and adapting the models themselves while neglecting the influence of feature engineering on model performance. This matters especially in medical imaging, where the tissue surrounding the region of interest can greatly interfere with the model, so designing good features that reduce this interference is another open problem. In view of these problems, this thesis carries out two studies: one builds a fusion model that integrates a patient's structured and unstructured data to improve pulmonary nodule classification; the other explores how different scales and modes of pulmonary nodule images affect the classification performance of deep learning models, and examines the feasibility of a new pulmonary nodule image mode.

Research content and results: (1) To address the problem that current algorithms cannot make full use of patient information, two fusion algorithms, SUDFNN and SUDFX, are proposed. They jointly model a patient's structured and unstructured data and make a more comprehensive diagnosis by mining effective information from multimodal data. Using the structured data in the LIDC-IDRI annotation files and the CT images in the LUNA16 dataset, we extracted 684 3D lung nodule images and their 9 corresponding structured features. The experimental results show that, compared with an algorithm using only image data, adding structured features significantly improves pulmonary nodule classification: the best model reaches an accuracy of 92.6%, a sensitivity of 91.9%, a specificity of 93.4%, and an area under the ROC curve of 0.971. (2) To address the feature-engineering problem in deep learning, this thesis examines the influence of different scales and modes of pulmonary nodule images on classification performance and proposes a 2D multi-view fusion method for pulmonary nodule image processing. Compared with the traditional 2D method, it captures more nodule information while introducing less interfering tissue. To verify the models, we preprocessed the LIDC-IDRI and LUNA16 datasets into four image modes (2D, 3D, 2D full-view fusion, and 2D multi-view fusion) at three scales (16, 25, and 36), and constructed four corresponding models: a 2D CNN, a 3D CNN, a 2D full-view fusion CNN, and a 2D multi-view fusion CNN. Training and validating on these samples, the final results show that the 2D multi-view fusion mode yields the best classification performance, achieving an accuracy of 92.8%, a sensitivity of 91.3%, a specificity of 93.6%, and an area under the ROC curve of 0.963; among the scales tested, smaller scales performed relatively better.

Conclusion: (1) Compared with models that use only image (unstructured) data, introducing structured data improves classification performance. (2) Structured data can capture the heterogeneity among pulmonary nodules and thereby help distinguish them. (3) Feature engineering has a large impact on a deep learning model's classification performance; 2D multi-view fusion images capture more nodule information while introducing less interfering tissue, which significantly improves classification performance. This thesis proposes two fusion algorithms, SUDFNN and SUDFX, that effectively combine structured and unstructured data; it then analyzes the influence of different scales and modes of pulmonary nodule images on classification performance and, on that basis, proposes a 2D multi-view fusion image mode that improves classification performance and extends well to other settings. These fusion algorithms and fusion image modes enrich and extend deep learning application research, and lay a foundation for building a follow-up medical big-data analysis methodology, giving them both academic theoretical significance and potential application value.
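The abstract does not reproduce the implementation. As an illustration only, the two core ideas — sampling several 2D views from a 3D nodule patch (the multi-view mode) and late fusion of image-derived features with the 9 structured features before a downstream classifier such as XGBoost — might be sketched as follows. The function names, the choice of three orthogonal central slices, and the toy feature values are all assumptions for illustration, not the thesis's actual design.

```python
import numpy as np

def multiview_2d(volume: np.ndarray) -> np.ndarray:
    """Extract the three orthogonal central slices (axial, coronal,
    sagittal) of a cubic nodule patch and stack them as channels.
    A 2D CNN can then consume the (views, s, s) stack directly."""
    s = volume.shape[0]
    assert volume.shape == (s, s, s), "expects a cubic patch"
    c = s // 2
    axial    = volume[c, :, :]
    coronal  = volume[:, c, :]
    sagittal = volume[:, :, c]
    return np.stack([axial, coronal, sagittal], axis=0)  # shape (3, s, s)

def fuse_features(image_features: np.ndarray,
                  structured: np.ndarray) -> np.ndarray:
    """Late fusion: concatenate CNN-derived image features with the
    structured (clinical/annotation) features into a single vector
    for a downstream classifier such as a gradient-boosted tree model."""
    return np.concatenate([image_features, structured])

# Toy usage on a 16x16x16 patch (one of the scales used in the thesis).
patch = np.random.rand(16, 16, 16).astype(np.float32)
views = multiview_2d(patch)           # shape (3, 16, 16)
img_feat = views.mean(axis=(1, 2))    # stand-in for a CNN embedding, shape (3,)
struct_feat = np.array([0.7, 0.3, 1.0, 0.0, 0.5, 0.2, 0.8, 0.1, 0.4])  # 9 features
fused = fuse_features(img_feat, struct_feat)
print(views.shape, fused.shape)       # (3, 16, 16) (12,)
```

In this late-fusion design the image branch and the structured branch stay independent until the final feature vector, which is one plausible reading of how a CNN and an XGBoost classifier could be combined; an alternative would be feeding both into a joint neural network, as the SUDFNN name suggests.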
Keywords/Search Tags: Lung nodule classification, Convolutional neural network, Extreme Gradient Boosting, Computer-aided diagnosis, Multi-scale and multi-mode image