Image fusion has received considerable attention in recent years as a powerful computer-vision technique. Because of the limited depth of field of a camera, it is difficult to capture a fully focused image: targets within the depth of field remain sharp, while scene content outside the depth of field is blurred. The goal of multi-focus image fusion is to fuse a group of images with different focus areas into a single, fully focused image. Compared with any single input, the fused image contains more texture and detail, is more consistent with human visual perception, and can be further applied to detection, recognition, and other tasks. For multimodal medical image fusion, the limitations of imaging sensors and imaging mechanisms make it difficult to include different categories of information in a single image; for example, CT images contain skeletal information, whereas MRI images are rich in soft-tissue information. Fortunately, the information in medical images of different modalities is largely complementary, so it is necessary to combine medical images from different imaging modalities to obtain more comprehensive information about the diseased tissue or organ. Medical image fusion integrates the information contained in multiple medical images of different modalities and provides effective technical support for clinical applications such as disease diagnosis and treatment planning.

For multi-focus image fusion, two fusion algorithms are proposed in this paper: a multi-focus image fusion algorithm based on a mutual coupling network and a multi-focus image fusion algorithm based on a cross-fusion network. For multimodal medical image fusion, this paper proposes an information-guided fusion algorithm that measures the contribution of the features of each convolutional block to the decoder. The main work of this paper is summarised as follows.

First, existing deep learning-based multi-focus image fusion methods connect the complementary information of the source images only at the front or the end of the encoder, which weakens the focus prediction capability of the decoder. A fully convolutional network is proposed to address this shortcoming by learning, in a supervised manner, the conditional focus probability for a pair of multi-focus images. In particular, through coupled connection blocks the encoder extracts conditional focus features more robustly at each layer, allowing the decoder to give more robust focus predictions. Furthermore, so that the decision map output by the network is closer to the label of the source images, a hybrid loss is designed to train the network: a structural sparsity fidelity loss encourages the conditional focus probability map to approach its corresponding ground truth, while a structural similarity loss serves as a regularization term so that structurally important pixels have smaller prediction residuals. Experimental results show that our method achieves state-of-the-art performance compared with the competing methods.
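The exact formulation of this hybrid loss is not reproduced here. As a minimal sketch of how such a fidelity-plus-regularizer combination could be assembled in PyTorch (assuming a single-channel probability map, using an L1 penalty as a stand-in for the structural sparsity fidelity term, and an assumed weighting factor `lam`):

```python
import torch
import torch.nn.functional as F

def gaussian_window(size=11, sigma=1.5):
    # 2-D Gaussian kernel used by the SSIM term.
    coords = torch.arange(size, dtype=torch.float32) - size // 2
    g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    g = (g / g.sum()).unsqueeze(0)
    return (g.t() @ g).unsqueeze(0).unsqueeze(0)  # shape (1, 1, size, size)

def ssim(pred, target, c1=0.01 ** 2, c2=0.03 ** 2):
    # Standard single-channel SSIM between the predicted map and the label.
    w = gaussian_window().to(pred.device)
    mu_p, mu_t = F.conv2d(pred, w, padding=5), F.conv2d(target, w, padding=5)
    var_p = F.conv2d(pred * pred, w, padding=5) - mu_p ** 2
    var_t = F.conv2d(target * target, w, padding=5) - mu_t ** 2
    cov = F.conv2d(pred * target, w, padding=5) - mu_p * mu_t
    ssim_map = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / \
               ((mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2))
    return ssim_map.mean()

def hybrid_loss(prob_map, gt, lam=0.1):
    # Fidelity term: pulls the conditional focus probability map toward its
    # ground truth (an L1 penalty is used here as a stand-in for the
    # structural sparsity fidelity loss described in the text).
    fidelity = F.l1_loss(prob_map, gt)
    # Regularization term: structural similarity keeps prediction residuals
    # small on structurally important pixels.
    regularizer = 1.0 - ssim(prob_map, gt)
    return fidelity + lam * regularizer
```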
Second, existing methods only perform simple concatenated fusion of the features extracted by the convolutional blocks, which may lose complementary information of the source images. For this reason, a new multi-focus image fusion network based on a cross-fusion structure is proposed to retain the complementary information of the source images. In this network, element-wise addition emphasizes the complementarity of features, element-wise multiplication emphasizes their commonality, and an attention mechanism evaluates the contribution of features at different scales to the fusion result (a sketch of this combination scheme is given at the end of this summary). Furthermore, the Dice loss is used as a constraint term to encourage a larger intersection between the decision map output by the network and the ground truth, improving the accuracy of focused-region detection. A comparison with five typical fusion methods on the "Lytro" dataset shows that our method achieves a significant improvement.

Finally, existing unsupervised multimodal medical image fusion methods fuse low-level and high-level features as the decoder input through a DenseNet structure in order to exploit useful information at multiple levels. However, these methods integrate the encoder features directly through concatenation without considering the extent of their contribution to the fusion result. This paper therefore designs a new framework for multimodal medical image fusion in which a pre-trained VGG-16 encoder with rich prior knowledge achieves satisfactory performance even on untrained tasks. An information-guided model (IGM) is designed to compute the contribution of the encoder's features at each layer to the decoder. Furthermore, a Siamese multi-scale cross-attention fusion module (SMSCAFM) is proposed to integrate complementary information from the two encoding branches; in particular, element-wise enhancement adjusts the commonality between features. In addition, saliency weights (SW) are introduced to constrain the similarity, in both luminance and structure, between the fused image and the two source images (also sketched below), and their effectiveness is verified through ablation experiments. Extensive experiments on ten categories of multimodal medical images, including GAD&MR-T1 and PET&MR-T2, together with comparisons against nine typical medical image fusion methods published in the last three years, show that the proposed IGNFusion method reaches a new level of performance.
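Returning to the cross-fusion network of the second contribution, the sketch below illustrates one plausible way to combine element-wise addition, element-wise multiplication, and a channel-attention weighting, together with a standard Dice loss on the decision map; the module layout, channel sizes, and attention design are illustrative assumptions rather than the paper's exact architecture.

```python
import torch
import torch.nn as nn

class CrossFuse(nn.Module):
    """Sketch: combine two same-scale feature maps by element-wise addition
    (complementarity) and multiplication (commonality), then weight the result
    with a squeeze-and-excitation style attention over channels."""
    def __init__(self, channels):
        super().__init__()
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 2 * channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.project = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_a, feat_b):
        added = feat_a + feat_b          # emphasizes complementary responses
        multiplied = feat_a * feat_b     # emphasizes common (co-activated) responses
        combined = torch.cat([added, multiplied], dim=1)
        weighted = combined * self.attn(combined)   # per-channel contribution weights
        return self.project(weighted)

def dice_loss(decision_map, gt, eps=1e-6):
    # Encourages a larger intersection between the predicted decision map
    # and the ground-truth focus map.
    inter = (decision_map * gt).sum()
    return 1.0 - (2.0 * inter + eps) / (decision_map.sum() + gt.sum() + eps)
```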
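For the saliency weights of the third contribution, the following sketch shows one plausible way to weight luminance and structure consistency between the fused image and the two source images; the Laplacian-based saliency measure, the gradient-based structure term, and the factor `alpha` are assumptions standing in for the paper's exact definitions.

```python
import torch
import torch.nn.functional as F

def saliency_weights(img_a, img_b, eps=1e-6):
    """Per-pixel saliency weights for the two source images. A simple
    local-contrast measure (absolute Laplacian response) is used here as a
    stand-in for the paper's saliency definition."""
    lap = torch.tensor([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]],
                       device=img_a.device).view(1, 1, 3, 3)
    s_a = F.conv2d(img_a, lap, padding=1).abs()
    s_b = F.conv2d(img_b, lap, padding=1).abs()
    total = s_a + s_b + eps
    return s_a / total, s_b / total

def sw_similarity_loss(fused, img_a, img_b, alpha=0.5):
    """Saliency-weighted similarity between the fused image and each source,
    covering luminance (intensity L1) and structure (gradient L1 as a proxy
    for structural consistency)."""
    w_a, w_b = saliency_weights(img_a, img_b)
    # Luminance term: intensity differences weighted by each source's saliency.
    lum = (w_a * (fused - img_a).abs()).mean() + (w_b * (fused - img_b).abs()).mean()

    def grad(x):
        # Forward-difference horizontal and vertical gradients.
        return x[..., :, 1:] - x[..., :, :-1], x[..., 1:, :] - x[..., :-1, :]

    fx, fy = grad(fused)
    ax, ay = grad(img_a)
    bx, by = grad(img_b)
    # Structure term: gradient differences weighted by the cropped saliency maps.
    struct = (w_a[..., :, 1:] * (fx - ax).abs()).mean() + (w_a[..., 1:, :] * (fy - ay).abs()).mean() \
           + (w_b[..., :, 1:] * (fx - bx).abs()).mean() + (w_b[..., 1:, :] * (fy - by).abs()).mean()
    return lum + alpha * struct
```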