
Research On Image Inpainting Technology Based On Cross-Scale Attention Mechanism

Posted on: 2024-03-05 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Y Ni | Full Text: PDF
GTID: 2568306941959979 | Subject: Computer Science and Technology
Abstract/Summary:
Images are one of the most important media for storing and sharing information, and their integrity significantly affects how people perceive that information. As a result, image inpainting has become an increasingly challenging research topic in computer vision. Image inpainting aims to infer and fill in missing content from the known content of an image, so that the result meets both visual expectations and the requirements of information transmission. The completed image produced by the network model should be semantically consistent at the global level and preserve fine texture details at the local level. In recent years, deep learning-based inpainting methods have made significant progress over traditional methods. However, they still suffer from blurred textures and unsatisfactory fusion of structure and texture. To address these limitations and achieve high-quality inpainting of damaged images, this thesis carries out the following research:

(1) Existing inpainting results show unsatisfactory fusion of high-level semantic structures and low-level surface textures, and existing networks neglect the strong generative ability of the decoder side. To address these problems, this thesis constructs an end-to-end image inpainting network based on a dual-path cross-scale attention mechanism. The network adopts UNet as its basic framework and is built around a cross-scale attention module, which transfers high-order structural information across scales to the adjacent shallower feature map. The module is introduced into every layer of both the encoding path and the decoding path: in the encoding path it fills in the missing features of the shallow feature maps, and in the decoding path it refines the features generated by upsampling. This design provides higher-quality input to the decoder layers and therefore better inpainting results. Experiments on the Façade and CelebA-HQ datasets show that the proposed network outperforms existing methods.

(2) To obtain more refined surface textures in the inpainting results, and to enable the network to handle both local texture features and global structural features, this thesis proposes an image inpainting network model based on a CNN and a transformer. This end-to-end network consists of an encoder, a symmetric decoder, and skip connections. The encoder is composed of a CNN-based feature encoding module followed by transformer modules. The damaged input image is first processed by stacked convolutional layers to extract multi-scale feature maps, and the extracted features are then processed by the transformer modules to learn global semantic structural features. Finally, the decoder uses multiple skip connections and cross-scale attention modules during upsampling to make full use of the feature information from each stage and gradually restore the image to its original resolution. Experiments on the Façade and CelebA-HQ datasets show that the improved hybrid network achieves superior performance and generates rich texture details while maintaining the overall semantic structure of the image.
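The abstract does not give the internals of the cross-scale attention module, so the following is only a minimal sketch of the general idea it describes: attention weights are computed on a deeper (coarser) feature map and then reused to fill missing features in the adjacent shallower map. The function names, the cosine similarity, the softmax scale, and the flattened-list data layout are all assumptions made for illustration, not the thesis's implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def softmax(scores, scale=10.0):
    """Numerically stable softmax; `scale` is an assumed sharpening factor."""
    top = max(scores)
    exps = [math.exp(scale * (s - top)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_scale_attention(deep, shallow, hole):
    """Fill hole blocks of the shallow map using attention computed on the deep map.

    deep:    N feature vectors, one per position of the deeper (coarser) map.
    shallow: N feature vectors, one per matching block of the shallower map
             (for example a flattened 2x2 block when the stride between maps is 2).
    hole:    N booleans, True where the position lies inside the missing region.
    """
    known = [j for j, is_hole in enumerate(hole) if not is_hole]
    if not known:
        # Nothing to attend to: return the shallow features unchanged.
        return [list(block) for block in shallow]

    filled = []
    for i, block in enumerate(shallow):
        if not hole[i]:
            # Known regions keep their original shallow features.
            filled.append(list(block))
            continue
        # Similarity is measured on the deep map, where structure is more reliable...
        weights = softmax([cosine(deep[i], deep[j]) for j in known])
        # ...and the same weights aggregate shallow blocks from the known region.
        filled.append([
            sum(w * shallow[j][k] for w, j in zip(weights, known))
            for k in range(len(block))
        ])
    return filled


# Toy example: position 2 is a hole whose deep feature resembles position 0.
deep = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
shallow = [[1.0, 1.0], [5.0, 5.0], [0.0, 0.0]]
hole = [False, False, True]
out = cross_scale_attention(deep, shallow, hole)
# out[2] ends up close to shallow[0], the block with the most similar deep feature.
```

In a real network the same computation would run on tensors inside each encoder and decoder stage; the list-based form here only shows how the attention computed at one scale is transferred to the next shallower scale.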
Keywords/Search Tags: Image inpainting, contextual attention, Transformer, deep learning, convolutional neural network