
Research On Face Super-resolution Based On Efficient Attention Mechanism

Posted on: 2024-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Xu
Full Text: PDF
GTID: 2558307136996169
Subject: Electronic information
Abstract/Summary:
Face super-resolution (FSR) refers to techniques that reconstruct high-resolution face images from low-resolution ones. Unlike general image super-resolution, the core goal of FSR is to recover as much as possible of the facial structure information (i.e., the shape of the facial features and the facial contours) that is missing from low-resolution face images. Although these structures make up only a small portion of the face, they are key to distinguishing different faces. Compared with the background regions of a face image, facial features and contours are usually harder to recover because they tend to span larger regions and require more global information. This thesis studies face super-resolution based on efficient attention mechanisms, focusing on how to fully extract and use both the local and the global features of face images to reconstruct high-quality face images. To this end, the thesis proposes the following three solutions:

(1) Previous methods mostly restore the face image details lost to degradation by combining deep learning with face priors. Considering that face priors require additional manual labelling and that the receptive field of a convolutional neural network (CNN) is limited, this thesis proposes a Transformer-based face super-resolution network with efficient attention mechanisms (Efficient Attention Mechanisms Network, EAMNet). EAMNet combines the CNN’s ability to extract local features with the Transformer’s strong global modelling ability, and can effectively restore both the global structure and the local texture details of a face without any prior information. EAMNet enlarges the image by post-upsampling, so its parameter count and computational cost are much lower than those of earlier pre-upsampling methods. Extensive experiments on mainstream datasets such as CelebA and Helen show that the network achieves good results.

(2) Although post-upsampling makes the network more efficient, it forces feature extraction to take place in a low-dimensional space throughout, which inevitably loses high-dimensional feature information. To address this deficiency, this thesis proposes an Efficient Bilateral Symmetrical Network (EBSNet) with an encoder-decoder structure for face super-resolution. The encoding stage extracts the high-level semantic information of the face image, the bottleneck stage strengthens the local feature information, and the decoding stage gradually upsamples the reduced feature maps to refine the geometry of each part of the face. This upsampling addresses the loss of facial detail caused by shrinking the face image during encoding. Experimental results on the CelebA and Helen datasets demonstrate the effectiveness of the network.

(3) The traditional encoder-decoder structure usually joins encoder and decoder features by simple concatenation. This does not make full use of the low-level features, and the low-level features cannot fully and effectively guide the learning of the high-level features, so super-resolution performance remains unsatisfactory. To address this problem, this thesis proposes a face super-resolution network based on multi-scale CNN-Transformer cooperation (CNN-Transformer Cooperation Network, CTCNet). To make full use of the multi-scale features extracted in the encoding stage, CTCNet introduces a multi-scale feature fusion scheme in the decoding stage, which gives the network better feature propagation and representation capabilities. The main goal of the multi-scale feature interaction is to explore and use the encoding-stage features during decoding.
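
The efficiency argument behind the post-sampling (post-upsampling) design described for EAMNet in point (1) can be illustrated with a small sketch. The code below is not taken from the thesis: the layer sizes, the 8x scale factor, and the helper names `conv_macs` and `pixel_shuffle` are assumptions chosen for illustration, and the depth-to-space rearrangement shown is one common way to upsample at the end of a network, not necessarily the one EAMNet uses. It compares the multiply-accumulate cost of a single convolution layer run at low resolution with the same layer run on a pre-upsampled input, then enlarges the low-resolution features in one final step.

```python
import numpy as np


def conv_macs(height, width, c_in, c_out, kernel=3):
    """Multiply-accumulate count of one stride-1, same-padded convolution layer."""
    return height * width * c_in * c_out * kernel * kernel


def pixel_shuffle(features, scale):
    """Depth-to-space rearrangement: (C*scale^2, H, W) -> (C, H*scale, W*scale)."""
    channels, height, width = features.shape
    if channels % (scale * scale) != 0:
        raise ValueError("channel count must be divisible by scale**2")
    out_channels = channels // (scale * scale)
    # Split the channel axis into (out_channels, row offset, column offset).
    x = features.reshape(out_channels, scale, scale, height, width)
    # Interleave each offset with its spatial axis: (C, H, row offset, W, column offset).
    x = x.transpose(0, 3, 1, 4, 2)
    return x.reshape(out_channels, height * scale, width * scale)


# Hypothetical sizes (assumptions, not values from the thesis).
lr_size, scale, hidden = 16, 8, 64

# Pre-upsampling: the layer runs on the already enlarged 128x128 grid.
macs_pre = conv_macs(lr_size * scale, lr_size * scale, hidden, hidden)
# Post-upsampling: the same layer runs on the 16x16 low-resolution grid.
macs_post = conv_macs(lr_size, lr_size, hidden, hidden)
ratio = macs_pre // macs_post  # equals scale**2

# Enlarge low-resolution features once, at the end of the network.
rng = np.random.default_rng(0)
features = rng.standard_normal((3 * scale * scale, lr_size, lr_size))
sr_image = pixel_shuffle(features, scale)

print(f"per-layer cost ratio (pre / post): {ratio}")
print(f"output shape after upsampling: {sr_image.shape}")
```

Under these assumptions the per-layer cost falls by a factor of `scale**2`, because every convolution before the final upsampling step sees `scale**2` fewer pixels. The same arithmetic shows the trade-off raised in point (2): all feature extraction happens on the small grid, so information that only exists at higher resolution has to be synthesized in the last step.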
Keywords/Search Tags: Face Super-resolution, Attention Mechanism, Transformer, Convolutional Neural Networks, Generative Adversarial Network