Research On Text-guided Face Generation And Editing Based On Memory Network And Semantic Modulation

Posted on: 2024-08-04
Degree: Master
Type: Thesis
Country: China
Candidate: S L Jiang
GTID: 2568307130453004
Subject: Computer technology
Abstract/Summary:
Text-guided face image generation and editing lies at the intersection of natural language processing and computer vision. The task generates or edits face images from user-given text descriptions. It is simple to operate and user-friendly, and it has research significance and application value in many fields, such as criminal investigation, dating and entertainment, face recognition, and medical aesthetics. With the development of deep learning, some progress has been made in text-guided face image generation and editing. Nevertheless, because text descriptions are complex and abstract and belong to a different modality from face images, the task still faces many problems and challenges. Text-guided face image generation suffers from distorted or missing faces and from low semantic consistency and similarity, while text-guided face image editing suffers from low editing accuracy, poor information decoupling, and poor identity preservation. To address these problems, this thesis studies both text-guided face image generation and text-guided face image editing, building on image generation techniques. The specific research work is summarized as follows:

(1) A fine-grained text-guided face image generation method based on multi-modal memory networks is proposed to improve the quality of generated face images, the semantic consistency between face images and text descriptions, and the similarity between generated face images and real images. First, the proposed multi-modal memory network determines the importance of different words according to the face image features at different stages, so that the generator can better focus on the face region being generated. This effectively addresses low-quality results such as distorted or missing faces. Meanwhile, the proposed word-level discriminator provides fine-grained training feedback to the generator and helps it establish accurate connections between complex text descriptions and face attributes. This enhances the robustness of the generator and improves both the semantic consistency between generated face images and text descriptions and the similarity between generated and real images. Finally, the effectiveness of the method is verified through extensive comparison and ablation experiments.

(2) A text-guided face image editing method based on information decoupling and semantic modulation is proposed to keep face identity and irrelevant attributes unchanged and to improve editing accuracy. Information decoupling explicitly separates the face attribute information in the text description by semantic level and injects each part into the sub-mapper that matches its semantic level. This improves the information decoupling capability of the model and prevents irrelevant attributes from changing during subsequent editing. In addition, a face attribute mapper containing four semantic modulation networks modulates the source face latent code according to the text features and gradually aligns it with them, which improves the accuracy of text-guided face image editing. Finally, losses related to attribute preservation are introduced during training to ensure that face identity and irrelevant attributes remain unchanged. Extensive comparison and ablation experiments verify the effectiveness of the method.

(3) A text-guided face image generation and editing system is designed and implemented. The main functional modules of the system are a data input module, a data storage module, a face generation module, and a face editing module. The study shows that the designed system has some application value.
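As an illustration of the word-importance idea in contribution (1), the following is a minimal sketch, not the thesis's actual network. It assumes that importance is computed as a softmax over dot-product similarities between one image-region feature and each word feature; the function names (`word_importance`, `text_context`) and the toy two-dimensional features are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def word_importance(image_feature, word_features):
    """Return one weight per word; the weights sum to 1.

    Assumption: importance is the softmax of the similarity between
    the current image-region feature and each word feature.
    """
    scores = [dot(image_feature, w) for w in word_features]
    return softmax(scores)


def text_context(image_feature, word_features):
    """Importance-weighted average of the word features."""
    weights = word_importance(image_feature, word_features)
    dim = len(word_features[0])
    return [
        sum(wt * w[i] for wt, w in zip(weights, word_features))
        for i in range(dim)
    ]


# Toy example: hypothetical 2-D features for three words.
words = ["blond", "hair", "smiling"]
word_features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]
hair_region = [2.0, 0.0]  # hypothetical feature of a hair region

weights = word_importance(hair_region, word_features)
context = text_context(hair_region, word_features)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.3f}")
```

In this toy setting the hair-region feature gives the most weight to "blond" and the least to "smiling". Repeating the computation with features from a later generation stage would yield a different weighting, which is the behaviour the abstract attributes to the memory network.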
Keywords/Search Tags:Text description, Face image generation, Face image editing, Identity preservation