Because medical images of different modalities are produced by different imaging principles, they depict different organs and tissues with different effectiveness. For example, CT images show bone and organs such as the liver clearly but offer poor contrast between soft tissues, whereas MR images provide higher-resolution soft-tissue detail and respond well to blood flow and metabolic changes in the brain and spinal cord, at the cost of lower spatial resolution. Each modality therefore has its own advantages and limitations, and it is difficult for a single-modality image to contain all the key information about a focal area. By synthesizing the complementary and redundant information in medical images of different modalities, multimodal medical image fusion overcomes the limitations of single-modality imaging of human tissues and organs, improves the utilization of medical image information, and helps clinicians reach more accurate diagnosis and treatment decisions. Through an in-depth study of multimodal medical image fusion theory, the Transformer architecture and the MAE masked pre-training strategy, this paper identifies existing problems and proposes improvements. The main contributions are as follows.

To address the weak global feature representation of existing deep-learning-based multimodal medical image fusion methods, this paper proposes a medical image fusion method based on local-global feature coupling and cross-scale attention. The method consists of an encoder, a fusion rule and a decoder. In the encoder, parallel CNN and Transformer branches extract the local features and the global representation of the image, respectively. At each scale, a feature coupling module embeds the local features of the CNN branch into the global representation of the Transformer branch so that complementary features are combined as fully as possible, and a cross-scale attention module is introduced to make effective use of the multi-scale feature representations. The encoder thus extracts local, global and multi-scale feature representations of the source images to be fused; the fusion rule merges the representations of the different source images, which are then injected into the decoder to generate the fused image. The encoder-decoder network is trained on an image reconstruction task, which avoids the need for a large pre-registered medical image dataset. A minimal sketch of the dual-branch encoder is given below.
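The following is a minimal sketch, assuming a PyTorch implementation, of how a parallel CNN/Transformer encoder with a feature coupling module might be organized. All class names, layer sizes and the single-scale simplification (the cross-scale attention module is omitted) are illustrative assumptions, not the actual implementation described in this paper.

```python
import torch
import torch.nn as nn


class FeatureCoupling(nn.Module):
    """Embeds local CNN features into the Transformer token sequence at one scale."""

    def __init__(self, cnn_channels, embed_dim):
        super().__init__()
        self.proj = nn.Conv2d(cnn_channels, embed_dim, kernel_size=1)

    def forward(self, cnn_feat, tokens, grid_hw):
        h, w = grid_hw
        # Project CNN features to the token dimension and resample to the token grid.
        local = self.proj(cnn_feat)
        local = nn.functional.adaptive_avg_pool2d(local, (h, w))
        local = local.flatten(2).transpose(1, 2)      # (B, h*w, embed_dim)
        return tokens + local                         # couple local and global features


class DualBranchEncoder(nn.Module):
    """Parallel CNN / Transformer branches coupled at a single scale."""

    def __init__(self, in_ch=1, cnn_ch=32, embed_dim=64, patch=8, depth=2, heads=4):
        super().__init__()
        self.cnn = nn.Sequential(                     # local-feature (CNN) branch
            nn.Conv2d(in_ch, cnn_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(cnn_ch, cnn_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.patch_embed = nn.Conv2d(in_ch, embed_dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(embed_dim, heads,
                                           dim_feedforward=2 * embed_dim,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, depth)  # global (Transformer) branch
        self.couple = FeatureCoupling(cnn_ch, embed_dim)

    def forward(self, x):
        local = self.cnn(x)                           # local features
        tokens = self.patch_embed(x)                  # patch tokens for the global branch
        b, c, h, w = tokens.shape
        tokens = tokens.flatten(2).transpose(1, 2)    # (B, h*w, embed_dim)
        tokens = self.couple(local, tokens, (h, w))   # inject local features into the tokens
        return self.transformer(tokens), local


if __name__ == "__main__":
    encoder = DualBranchEncoder()
    ct_slice = torch.randn(1, 1, 64, 64)              # e.g. a single-channel CT slice
    global_tokens, local_feat = encoder(ct_slice)
    print(global_tokens.shape, local_feat.shape)      # (1, 64, 64) and (1, 32, 64, 64)
```

In the full method described above, this coupling would be repeated at several scales, and the resulting multi-scale representations would interact through the cross-scale attention module before being passed to the fusion rule and decoder.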
Because the reconstruction task used to train the local-global feature coupling and cross-scale attention network is relatively simple, the network cannot learn the high-level feature representations that the image fusion task requires. To address this, a multimodal medical image fusion method based on MAE pre-training is proposed. The method is divided into two stages: pre-training and fusion. In the pre-training stage, the encoder-decoder network is trained on the MAE masked reconstruction task. In the fusion stage, a self-attention-based feature fusion module is designed to replace the hand-crafted fusion rule, and the network parameters are fine-tuned on the multimodal medical image fusion task, yielding the complete fusion model: the encoder extracts the features of each source image, the feature fusion module combines the features of the different images, and the decoder reconstructs the fused image. A sketch of the masked pre-training step and the self-attention fusion module is given at the end of this section.

For the two proposed networks, experiments were carried out in PyCharm and MATLAB, respectively, and the results were compared with recently proposed image fusion methods. The fusion results obtained by the proposed methods perform well in terms of both objective metrics and subjective visual quality, which demonstrates that the multimodal medical image fusion task is carried out effectively.
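As an illustration of the second method, the sketch below shows, with assumed PyTorch code and hypothetical names, the two ingredients it relies on: MAE-style random masking of patch tokens for the pre-training task, and a self-attention feature fusion module that replaces the hand-crafted fusion rule. The token shapes match the encoder sketch above; none of this is taken from the paper's actual implementation.

```python
import torch
import torch.nn as nn


def random_mask(tokens, mask_ratio=0.75):
    """MAE-style pre-training step: keep only a random subset of patch tokens."""
    b, n, d = tokens.shape
    keep = int(n * (1 - mask_ratio))
    noise = torch.rand(b, n, device=tokens.device)
    ids_keep = noise.argsort(dim=1)[:, :keep]                     # visible-patch indices
    visible = torch.gather(tokens, 1, ids_keep.unsqueeze(-1).expand(-1, -1, d))
    return visible, ids_keep                                      # decoder reconstructs the rest


class SelfAttentionFusion(nn.Module):
    """Fuses token features of two source images with multi-head self-attention."""

    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, feat_a, feat_b):
        # Let the tokens of both modalities attend to each other, then merge
        # the two sequences back into a single fused feature sequence.
        x = torch.cat([feat_a, feat_b], dim=1)
        x = x + self.attn(self.norm(x), self.norm(x), self.norm(x))[0]
        fused_a, fused_b = x.split([feat_a.size(1), feat_b.size(1)], dim=1)
        return self.proj(torch.cat([fused_a, fused_b], dim=-1))


if __name__ == "__main__":
    ct_tokens = torch.randn(1, 64, 64)        # encoder output for a CT slice
    mr_tokens = torch.randn(1, 64, 64)        # encoder output for an MR slice
    visible, ids = random_mask(ct_tokens)     # pre-training stage: mask 75% of the patches
    fused = SelfAttentionFusion()(ct_tokens, mr_tokens)   # fusion stage
    print(visible.shape, fused.shape)         # (1, 16, 64) and (1, 64, 64)
```

In the fusion stage described above, the pre-trained encoder and decoder are reused, and a module of this kind is fine-tuned together with them on the multimodal fusion task instead of applying a manually designed fusion rule.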