Medical imaging has unique advantages in the diagnosis of certain diseases. It is becoming increasingly popular in medical institutions at all levels, greatly improving the quality of medical care. However, it also brings a large demand for radiologists. Because training a radiologist takes a long time, the supply cannot keep up with the rapidly growing demand; radiologists are therefore overloaded and easily fatigued, which leads to misdiagnoses. In this context, the technology of assisting radiologists in reading medical images with computers, i.e., computer-aided diagnosis, came into being. With the emergence of deep learning, computer-aided diagnosis has developed rapidly and achieved great success. However, it still faces the following challenges in practice.

1. How to eliminate the effect of domain shift? Medical imaging devices from different manufacturers use their own proprietary imaging algorithms, which results in different image styles, also known as different domains. When a deep learning model is applied to a domain that is not covered by its training set, its performance drops dramatically.

2. How to perform multi-modal fusion? The diagnosis of some difficult diseases relies on more than one medical imaging examination, and the images of different modalities differ considerably. How to effectively integrate the information of multiple modalities has not been well resolved.

3. How to use expert knowledge effectively? Reading medical images relies heavily on the experience of experts. At present, experts' experience and knowledge are not fully exploited in deep learning systems. If this expert knowledge can be used comprehensively, both the performance and the interpretability of deep learning systems will improve.

To address these three challenges faced by current computer-aided diagnosis systems for medical images, this dissertation studies each in turn and proposes solutions. Details are as follows.

1. A
generative domain adaptation (GDA) method is proposed to solve the generalization problem across domains. This method adapts different domains to a virtual common domain at the appearance level, so that the original domains are aligned in the common domain. To this end, a domain-shared generator transforms the input images, and two competing discriminators adversarially supervise the transformation process. In the proposed method, domain alignment is performed at the image level while downstream tasks are learned at the feature level, which loosens the entanglement between the two tasks and reduces their mutual interference. An auxiliary holistic perceptual loss further supervises the generator to preserve anatomical information. Experimental results on a large-scale dataset containing 46k chest X-ray images demonstrate that GDA outperforms representative domain adaptation methods by a large margin on both disease classification and lesion detection.

In addition, for CT (computed tomography) images, an adversarial frequency alignment method is further proposed based on the imaging principle of CT. This method first transforms the original image into the frequency domain, projects different images into a common virtual frequency domain through an adaptive transition module, and then transforms them back to spatial-domain images. To handle the multi-domain adversarial training problem, this dissertation also proposes a random domain adversarial training strategy. Our method outperforms representative domain generalization methods on both public and in-house datasets.

2. A collaborative attention network is proposed to address the multi-modal learning problem. Medical images of different modalities differ greatly, e.g., 2D versus 3D and macroscopic versus microscopic (pathological) images. Therefore, the alignment, interaction, and fusion of features from different modalities should be considered simultaneously when processing multi-modal data. Following this
scheme, we propose the collaborative attention network and apply it to the diagnosis of gliomas based on MRI (magnetic resonance imaging) and pathological images. The network consists of three modules, i.e., multi-instance attention, cross-attention, and attention fusion. To align the features of the two modalities, the pathological image is first divided into patches and the noisy patches are filtered out by a noise reduction algorithm; multi-instance attention then learns an adaptive fusion weight for each patch, which is used to obtain features with the same dimension as the MRI features. After the features of the two modalities are aligned, the cross-attention module implicitly captures the relation between the modalities and enhances each feature with complementary information from the other. The learned cross-attention matrices imply the feature reliability, so they are further utilized in the attention fusion module to obtain a coefficient for each modality and linearly fuse the features into the final representation. The three attention modules collaborate to discover a comprehensive multi-modal representation. The proposed method surpasses other multi-modal fusion methods on the public CPM-RadPath dataset.

3. An expert-knowledge-infused method is proposed for the severity assessment of COVID-19. Diagnostic knowledge of COVID-19 is utilized in devising both the framework and the feature engineering. Specifically, we first divide the lung in the CT image into five lobes and obtain the positions of COVID-19 lesions with a coarse-to-fine segmentation model. Based on radiologists' experience, a novel post-processing strategy is proposed to improve lesion segmentation. An initial assessment score is then computed according to the definition of COVID-19 severity. Since the lesion segmentation results are not completely equivalent to the degree of infection, and the segmentation model will bring some
false results inevitably, we introduce auxiliary handcrafted features based on expert knowledge as compensation. These features, derived from the imaging manifestations of COVID-19, describe the texture and intensity of the lung. A multi-layer perceptron predicts another assessment score from these handcrafted features. Finally, the two predicted scores are fused to obtain the final score. The proposed method outperforms junior radiologists on our clinical COVID-19 dataset.

Based on the above research, we developed computer-aided diagnosis systems for diseases such as pulmonary nodules and COVID-19 by deploying the relevant deep learning models. These systems have been successfully applied in many medical institutions and have been widely praised.
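To make the cross-attention and attention-fusion idea of the second contribution more concrete, the following is a minimal NumPy sketch of the data flow only. All names (`cross_attention_fuse`, `f_mri`, `f_path`) and the toy feature shapes are hypothetical; the dissertation's actual modules are learned neural layers with trainable projections, whereas this sketch uses raw dot-product attention.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(f_mri, f_path):
    """Sketch of cross-attention between two aligned modalities.

    f_mri:  (n_m, d) MRI feature tokens
    f_path: (n_p, d) pathology feature tokens (already reduced to dim d)
    Returns a single fused d-dimensional representation.
    """
    d = f_mri.shape[-1]
    # Cross-attention matrices: each modality attends to the other.
    a_mp = softmax(f_mri @ f_path.T / np.sqrt(d))   # (n_m, n_p)
    a_pm = softmax(f_path @ f_mri.T / np.sqrt(d))   # (n_p, n_m)
    # Enhance each feature with complementary information from the other.
    f_mri_enh = f_mri + a_mp @ f_path
    f_path_enh = f_path + a_pm @ f_mri
    # Attention fusion: derive a scalar coefficient per modality from the
    # attention matrices, then linearly fuse the pooled enhanced features.
    w = softmax(np.array([a_mp.mean(), a_pm.mean()]))
    return w[0] * f_mri_enh.mean(axis=0) + w[1] * f_path_enh.mean(axis=0)
```

In the actual network the fusion coefficients are learned from the attention matrices rather than taken as their raw means; the sketch only shows how the three steps (alignment to a common dimension, cross-modal enhancement, coefficient-weighted fusion) compose.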