Image fusion refers to the process of integrating the image information of a scene collected by different sensors; it is a subfield of information processing. Image fusion technology aims to improve image quality, geometric registration accuracy, or signal-to-noise ratio, and can effectively compensate for incomplete image data in target extraction and recognition. An efficient image fusion method can comprehensively process information from multi-source channels, improve the usability of images, and raise the discrimination accuracy of an image processing system in target detection, target recognition, content understanding, and other tasks. This thesis focuses on two popular fields of image fusion: multi-focus image fusion and multi-modal medical image fusion. The main contributions are as follows:

(1) In the field of multi-focus image fusion, a generative adversarial network (GAN) fusion method based on preference learning is proposed. The brightness component in HSV color space serves as the input of the generator, which outputs a per-pixel focus probability map to complete the fusion. The discriminator forms adversarial training by distinguishing the generator's output from the real label, thereby improving the generator's performance. Loss functions in prior works compute residuals in the global space and therefore cannot prevent residual error from propagating between pixels, so the features extracted by the network lack structural group sparsity. To address this, an effective loss function is designed to train the model: the l2,1 norm, which induces structural group sparsity, is employed to construct a focus fidelity loss that regularizes the adversarial loss of the generator, while the discriminator loss is borrowed from WGAN-GP for training stability. In addition, a training strategy named preference learning is designed, which assigns a different learning weight to each sample during training: samples that are difficult to learn receive a greater learning preference, so the network can learn more features from difficult samples.

(2) In the field of multi-modal medical image fusion, considering that traditional methods and some deep learning methods require cumbersome manual design and cannot be applied directly to multi-modal medical images, this thesis proposes a new unsupervised multi-modal medical fusion network that requires no complex manual design while effectively fusing multi-modal medical images. CNN-based networks cannot effectively capture global context information or establish long-distance dependencies in image information. To solve this problem, this thesis introduces the ViT architecture and combines the ViT model with the CNN model as a hybrid feature extractor, which helps extract more valuable features. Furthermore, existing methods do not emphasize the complementarity between the source images and the fused image, which limits fusion performance to some extent. This method designs a new complementary information fidelity loss, which not only enhances the complementary information in the fusion result but also allows the fused image to retain more information from the source images. Moreover, the brightness degradation that arises during medical image fusion is solved, for the first time, without any weight design.

Extensive experiments verify the effectiveness of the two fusion methods above. The experimental results show that the proposed methods outperform existing methods in the corresponding fusion tasks, both visually and in objective metrics.
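To make the two training ideas in contribution (1) concrete, the following is a minimal NumPy sketch, not the thesis's exact formulation: an l2,1-style focus fidelity term that groups the residual channel-wise per pixel, and a toy preference-learning weighting in which harder samples (larger losses) receive larger normalized weights. The function names and the power-based weighting rule are illustrative assumptions.

```python
import numpy as np

def l21_focus_fidelity(pred, target):
    """l2,1 norm of the residual map: l2 over each pixel's channel group,
    then l1 (a plain sum) over pixels. The l2-within-group / l1-across-groups
    structure is what encourages structural group sparsity of the error."""
    residual = pred - target                            # (H, W, C)
    per_pixel = np.sqrt((residual ** 2).sum(axis=-1))   # l2 over the channel group
    return per_pixel.sum()                              # l1 over the pixel groups

def preference_weights(sample_losses, gamma=2.0):
    """Toy preference-learning rule (assumed form): raise each sample's loss
    to a power gamma and normalize, so difficult samples get a greater
    learning preference in the next training step."""
    losses = np.asarray(sample_losses, dtype=float)
    w = losses ** gamma
    return w / w.sum()
```

For example, a residual of all ones over a 2x2 image with 3 channels yields a focus fidelity of 4*sqrt(3), and with `gamma=1.0` two samples with losses 1.0 and 3.0 receive weights 0.25 and 0.75.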
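The complementary information fidelity idea in contribution (2) can likewise be sketched in a hedged form. The thesis's actual loss is not specified here; the version below simply pulls the fused image, at each pixel, toward whichever source modality carries the stronger signal, so that complementary detail from both sources is preserved rather than averaged away (which is one cause of brightness degradation).

```python
import numpy as np

def complementary_fidelity_loss(fused, src_a, src_b):
    """Illustrative complementary-information fidelity term (assumed form).
    At each pixel, the target is the source with the larger magnitude, so the
    fused result is encouraged to keep the dominant information from either
    modality instead of a washed-out average of the two."""
    target = np.where(np.abs(src_a) >= np.abs(src_b), src_a, src_b)
    return np.mean((fused - target) ** 2)
```

With sources `[[1, 0]]` and `[[0, 2]]`, the pixel-wise target is `[[1, 2]]`: a fused image equal to that target incurs zero loss, while an all-zero fused image is penalized.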