
Research on Cross-Modal Text-to-Image Generation Based on Generative Adversarial Networks

Posted on: 2024-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: C Hu
Full Text: PDF
GTID: 2568307067963279
Subject: Engineering
Abstract/Summary:
In recent years, deep learning has made significant progress on many computer vision tasks, including text-to-image generation. This task also involves the text modality and therefore draws on natural language processing: the goal is to generate high-fidelity images that match a given text description. Owing to the impressive performance of generative adversarial networks (GANs) on image generation, they have gradually become the mainstream solution for text-to-image generation. However, GANs suffer from inherent limitations such as training instability and mode collapse. At the same time, the large semantic gap between the text and image modalities makes joint distribution learning difficult and cross-modal fusion insufficient, which ultimately leads to low-quality generated images and semantic deviation from the input text. To address these problems, this thesis proposes two text-to-image generation algorithms built on stacked generative adversarial networks. The summary is as follows:

First, this thesis proposes Shuffle Attention Generative Adversarial Networks (SA-GAN) to address low color brightness in synthesized images and the weak correlation between RGB channels in the concatenated text-image features. A lightweight shuffle attention module is inserted where text and image features are concatenated in the generator, so as to capture channel-wise correlation between the text and image vectors. In addition, a perceptual loss computed from the first 35 feature layers of VGG19 is introduced as an auxiliary constraint to improve the perceptual realism of the generated images. Experiments show that on the CUB dataset the IS reaches 4.02, an improvement of 8.6% and 0.49% over StackGAN-v1 and StackGAN-v2 respectively, and the FID reaches 47.31, an improvement of 8.8% over StackGAN-v1. On the Oxford dataset the IS reaches 3.07, an improvement of 4.3% and 5.8% over StackGAN-v1 and StackGAN-v2 respectively, and the FID reaches 49.46, an improvement of 10.5% over StackGAN-v1.

Second, to address the failure to account for the differing importance of semantically distinct words, and to further improve text-image fusion, this thesis proposes Memory Gate AttnGAN (MG-AttnGAN), which integrates a memory-gate attention module while retaining the shuffle attention and perceptual loss of the first algorithm. A three-stage stacked generator raises the image resolution to 256×256. The memory-gate attention module initializes the weights of different attribute words and performs similarity matching and fusion with image sub-regions. Spectral normalization is applied to the discriminator to stabilize its density-ratio estimation in high-dimensional space, thereby stabilizing GAN training. Experiments show that on the CUB dataset the IS reaches 4.61, an improvement of 5.7% over the baseline AttnGAN, and the FID reaches 19.58, an improvement of 18.3% over AttnGAN...
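The abstract does not give implementation details, but shuffle attention generally follows SA-Net (Zhang & Yang, 2021): channels are split into groups, each group gets a channel-attention and a spatial-attention branch, and a channel shuffle mixes information across groups. Below is a minimal PyTorch sketch of such a module; the group count and the exact placement at the generator's text-image concatenation are illustrative assumptions, not taken from the thesis.

```python
import torch
import torch.nn as nn

class ShuffleAttention(nn.Module):
    """SA-Net-style lightweight shuffle attention (sketch).
    Hyperparameters are illustrative, not from the thesis."""

    def __init__(self, channels, groups=8):
        super().__init__()
        assert channels % (2 * groups) == 0
        self.groups = groups
        c = channels // (2 * groups)  # channels per branch within a group
        # learnable scale/shift for the channel-attention branch
        self.cw = nn.Parameter(torch.zeros(1, c, 1, 1))
        self.cb = nn.Parameter(torch.ones(1, c, 1, 1))
        # learnable scale/shift for the spatial-attention branch
        self.sw = nn.Parameter(torch.zeros(1, c, 1, 1))
        self.sb = nn.Parameter(torch.ones(1, c, 1, 1))
        self.gn = nn.GroupNorm(c, c)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        b, c, h, w = x.shape
        x = x.reshape(b * self.groups, c // self.groups, h, w)
        x_chan, x_spat = x.chunk(2, dim=1)
        # channel branch: global average pooling, then scale/shift gate
        attn_c = x_chan.mean(dim=(2, 3), keepdim=True)
        x_chan = x_chan * self.sigmoid(attn_c * self.cw + self.cb)
        # spatial branch: group-normalized features, then scale/shift gate
        attn_s = self.gn(x_spat)
        x_spat = x_spat * self.sigmoid(attn_s * self.sw + self.sb)
        out = torch.cat([x_chan, x_spat], dim=1).reshape(b, c, h, w)
        # channel shuffle: interleave channels across the groups
        out = out.reshape(b, self.groups, c // self.groups, h, w)
        return out.transpose(1, 2).reshape(b, c, h, w)
```

In the thesis, a module of this kind would be applied to the feature map formed by concatenating the text embedding (spatially broadcast) with the generator's image features, so the attention can relate text and image channels.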
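A hedged sketch of the VGG19 perceptual loss described above, assuming the standard frozen-VGG feature comparison; the cutoff at feature index 35 follows the abstract's wording, and the L1 criterion is an assumption on my part.

```python
import torch.nn as nn
from torchvision.models import vgg19, VGG19_Weights

class VGGPerceptualLoss(nn.Module):
    """Perceptual loss from a frozen VGG19: compares deep feature maps
    of the synthesized and real images. Layer cutoff 35 is taken from
    the abstract; the thesis may slice the network differently."""

    def __init__(self, layer_cutoff=35):
        super().__init__()
        feats = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features
        self.extractor = nn.Sequential(*list(feats.children())[:layer_cutoff])
        for p in self.extractor.parameters():
            p.requires_grad = False  # VGG is a fixed feature extractor
        self.extractor.eval()
        self.criterion = nn.L1Loss()

    def forward(self, fake_img, real_img):
        # both inputs are expected as ImageNet-normalized RGB tensors
        return self.criterion(self.extractor(fake_img),
                              self.extractor(real_img))
```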
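The memory-gate attention module is specific to this thesis and its internals are not given in the abstract. The sketch below is only one plausible interpretation, combining AttnGAN-style word/sub-region similarity matching with a learned per-word gate that re-weights attribute words before fusion; all layer names and shapes here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedWordRegionAttention(nn.Module):
    """Illustrative interpretation of memory-gate attention:
    gate word embeddings, then match and fuse them with image
    sub-regions. Not the thesis's exact MG-AttnGAN module."""

    def __init__(self, word_dim, region_dim):
        super().__init__()
        self.proj = nn.Linear(word_dim, region_dim)  # words -> region space
        self.gate = nn.Linear(word_dim, 1)           # per-word importance

    def forward(self, words, regions):
        # words:   (B, T, word_dim)   word embeddings
        # regions: (B, N, region_dim) image sub-region features
        w = self.proj(words)                               # (B, T, D)
        g = torch.sigmoid(self.gate(words))                # (B, T, 1)
        w = w * g                                          # gated word weights
        # similarity matching between every sub-region and every word
        attn = F.softmax(regions @ w.transpose(1, 2), -1)  # (B, N, T)
        # fusion: a word-context vector for each image sub-region
        return attn @ w                                    # (B, N, D)
```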
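Spectral normalization of discriminator layers is a standard stabilization technique (Miyato et al., 2018) with built-in PyTorch support; it constrains each layer's spectral norm and hence the discriminator's Lipschitz constant. A minimal sketch of how a discriminator convolution would be wrapped; the block shape is illustrative, not the thesis architecture.

```python
import torch.nn as nn
from torch.nn.utils import spectral_norm

def sn_conv_block(in_ch, out_ch, kernel_size=4, stride=2, padding=1):
    """Downsampling conv block with spectral normalization, as commonly
    used in GAN discriminators (illustrative, not the thesis design)."""
    return nn.Sequential(
        spectral_norm(nn.Conv2d(in_ch, out_ch, kernel_size, stride, padding)),
        nn.LeakyReLU(0.2, inplace=True),
    )
```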
Keywords/Search Tags: Generative Adversarial Network, Semantic Gap, Shuffle Attention, Memory Gate Attention, Deep Fusion