
Research on Infrared and Visible Image Fusion Methods Based on Deep Neural Networks

Posted on: 2023-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: C X Xu
Full Text: PDF
GTID: 2568306614993449
Subject: Computer Science and Technology

Abstract/Summary:
With the advent of the information age, image data has become an important source of information and plays an increasingly important role in data and information processing. Because images captured by a single sensor suffer from insufficient information and incomplete targets, multimodal image fusion technology has emerged. Image fusion integrates image information captured by different types of sensors, retaining as much feature information as possible, and multimodal images are more valuable than single-modal images for subsequent high-level vision tasks. Common application areas include the fusion of medical MRI and CT images, remote-sensing MS and PAN images, and military infrared and visible images. Infrared and visible image fusion has important application and research value in security surveillance, military reconnaissance, industrial inspection, and other fields.

Extensive research on infrared and visible image fusion has been carried out by experts at home and abroad. Traditional fusion methods are limited because image transformations and fusion rules must be designed by hand, and by their computational cost and implementation difficulty. Since infrared and visible images differ in imaging principle and wavelength spectrum, fusing them with deep neural network techniques is a natural direction. This thesis therefore investigates infrared and visible image fusion based on deep neural network technology and proposes two fusion methods: one based on a residual attention generative adversarial network and one based on a gradient residual transformer. The main work of this thesis is as follows.

(1) Research on an image fusion method based on a Residual Attention Generative Adversarial Network (RAGFuse). Since infrared and visible images
have different representational characteristics and are susceptible to interference from the external environment, and existing generative adversarial network-based fusion methods do not handle this problem well. In addition, channel concatenation is usually used for multimodal input, which may lose some important image feature information. To address these problems, this thesis proposes RAGFuse, which combines a generative adversarial network with a residual network and introduces an RBAM (ResNet Block Attention Module) to capture locally salient image features from both the channel and the spatial perspective. The network is also changed to a dual-input model so that the feature maps of the two input images preserve the salient targets and detailed content of the source images, and a total variation term is added to the loss function to improve training performance and the generalization ability of the model. RAGFuse is evaluated on the public TNO and INO datasets; compared with eight state-of-the-art methods, it produces better fusion results in subjective qualitative evaluation and improves, to varying degrees, the objective quantitative metrics EN, SD, SSIM, and MI.

(2) Research on an unsupervised image fusion method with a gradient residual transformer (GRTFuse). In end-to-end unsupervised fusion frameworks, the convolution operations that extract features in deep neural networks attend mainly to local features and give little consideration to global ones; moreover, infrared and visible images differ greatly in contrast and texture detail, so the fine-grained feature information of the source images is not retained well. To address these problems, this thesis proposes GRTFuse, which adopts an encoder-fusion layer-decoder
structure: the encoder extracts features, the fusion layer fuses them, and the decoder reconstructs the fused image. In the encoder, a gradient operator is combined with dense blocks and residual units to focus on the fine-grained feature information of the image. An axial transformer is also introduced to better capture global features, so that the fused image preserves global structure while attending to local detail. GRTFuse is evaluated in both subjective qualitative and objective quantitative terms through comparison experiments on the public TNO dataset; its objective metrics (SCD = 1.811, SSIM = 0.760, FMIdct = 0.358, MS_SSIM = 0.921) show superiority over six traditional methods, two deep learning methods, and the RAGFuse method.
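The abstract names two concrete ingredients that are easy to illustrate in isolation: the total variation term added to the RAGFuse loss, and the gradient operator the GRTFuse encoder uses to emphasize fine-grained texture. The sketch below is not the thesis code; it is a minimal pure-Python illustration, with a grayscale image represented as a list of rows and the common Sobel kernels assumed as the gradient operator (the thesis does not specify which operator is used).

```python
def total_variation(img):
    """Anisotropic total variation of a 2-D image (list of rows):
    the sum of absolute differences between neighboring pixels.
    Added to a fusion loss, it penalizes noisy, high-frequency
    artifacts in the generated image."""
    h, w = len(img), len(img[0])
    tv = 0.0
    for i in range(h):
        for j in range(w):
            if i + 1 < h:
                tv += abs(img[i + 1][j] - img[i][j])  # vertical difference
            if j + 1 < w:
                tv += abs(img[i][j + 1] - img[i][j])  # horizontal difference
    return tv

# Assumed gradient operator: the standard 3x3 Sobel kernels.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def sobel_gradient_magnitude(img):
    """Per-pixel gradient magnitude sqrt(gx^2 + gy^2) with
    clamp-to-edge padding -- the kind of hand-crafted gradient cue
    an encoder can inject to highlight fine-grained detail."""
    h, w = len(img), len(img[0])

    def px(i, j):  # clamp indices to the image border
        return img[min(max(i, 0), h - 1)][min(max(j, 0), w - 1)]

    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            gx = gy = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    v = px(i + di, j + dj)
                    gx += SOBEL_X[di + 1][dj + 1] * v
                    gy += SOBEL_Y[di + 1][dj + 1] * v
            out[i][j] = (gx * gx + gy * gy) ** 0.5
    return out
```

On a constant image both quantities are zero, while a vertical step edge yields a positive total variation and a gradient response concentrated at the edge, which is exactly the behavior that makes the two terms useful as a smoothness penalty and a detail cue, respectively.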
Keywords/Search Tags: image fusion, infrared images, visible images, generative adversarial networks, encoder, decoder