Research On Generative Model Theory And Application

Posted on:2021-02-21

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y Gan

GTID:1368330626955756

Subject:Computer Science and Technology

Abstract/Summary:

The famous physicist Richard Feynman said, "What I cannot create, I do not understand." If Artificial Intelligence (AI) can learn to create or generate real images, it will help AI understand the real world. However, making AI learn to create or generate images that do not actually exist is a great challenge. To meet this challenge, the generative model has gradually stood out among AI algorithms and attracted researchers' attention, because it can effectively learn the real data distribution. Generative models have achieved fruitful results in many application fields, such as image synthesis, text generation and video generation. This dissertation focuses on the theory of generative models and on several key issues in image synthesis. First, it improves the recognition ability of the discriminator and stabilizes the training process of generative adversarial networks (GANs) to obtain better noise-to-image synthesis. Second, it guides GANs to learn more information to obtain better image translation. Third, it keeps the discriminator from falling into a local sub-optimal state too early, to obtain better text-to-image synthesis. Five related studies have been carried out, as follows:

(1) To address the problem that the recognition ability of the discriminator is limited and model training is unstable in the noise-to-image synthesis task, this dissertation proposes an image synthesis method based on sample augmentation and a conditional constraint. Firstly, to improve the discriminator's discriminative ability during training, it designs a hybrid augmented discriminator that combines real samples and fake samples. Secondly, to reduce the instability caused by noise that causes a sudden change in the generator's loss value, a penalty on the ill-conditioned number is imposed on the generator. Finally, the proposed method is applied to three different loss functions to verify the generalization of the hybrid augmented discriminator and the penalty. Experimental results show that, by mixing samples and applying the ill-conditioned number penalty, the method improves the discriminator's discriminative ability, stabilizes the training process, and produces better generated images.

(2) To deal with the problem that, in noise-to-image synthesis, the generator is not robust enough to the input noise and the discriminator's discriminative ability gradually decreases, this dissertation proposes an image synthesis method based on a denoising auto-encoder constraint. Firstly, to improve the robustness of the generator to the input noise, a new generator constraint is proposed: the Frobenius norm (F-norm) of the difference between the encoding of the image generated from the perturbed noise and the input noise. Then, to prevent the discriminator's discriminative ability from decreasing gradually during training, a new sample augmented discriminator is designed. It combines the images generated from the input noise and from the corresponding perturbed noise with the real images for training. Finally, the denoising penalty and the sample augmented discriminator are applied to five different models to verify the scalability of the proposed method. Experimental results show that the denoising penalty and the sample augmented discriminator improve the quality of the images generated by the baseline models.

(3) To handle the problem that, in noise-to-image synthesis, some input noises and generated samples make the training of the generator and discriminator unstable and degrade the quality of the generated images, this dissertation proposes an image synthesis method based on auxiliary network regulation. Firstly, to reduce the instability of the generator and of the discriminator during training, it introduces auxiliary noise as input and designs a learnable auxiliary module. Secondly, to train the learnable auxiliary module together with GANs, a learnable auxiliary penalty is designed to constrain the generator and a learnable auxiliary discriminator is designed to improve the stability of the discriminator. The proposed method is then applied to the Hinge and LSGANs loss functions to verify its extensibility. Experimental results show that this method improves the stability of GAN training and the performance of the baseline models to varying degrees.

(4) To solve the problem that, in the image translation task, the detail content of the generated image is lost because there is no auxiliary information to guide generation, this dissertation proposes a multi-constraint image translation method with an additional auxiliary domain. Firstly, to guide the generator to learn more about the details of the target-domain image, it adds a similar auxiliary domain. Then, to overcome the problem that the model's mapping space is too large, a cycle consistency loss function covering three domains is designed. Finally, to make model training more stable, a multi-scale, multi-level discriminator is designed. Experimental results show that, by adding a similar auxiliary domain and using multiple constraints, this method enriches the detail content of the generated images and improves their quality.

(5) To handle the problem that, in text-to-image synthesis, the discriminator tends to fall into a local sub-optimal state too early, resulting in poor-quality generated images, this dissertation proposes a text-to-image synthesis method built on a hybrid loss augmented discriminator. Firstly, to prevent the discriminator from falling into a local sub-optimal state prematurely, it designs a novel hybrid loss augmented discriminator. Secondly, to reduce the sensitivity of the discriminator's classification and make it attend to semantic and structural changes, the discriminator is trained by adding the loss value of the fake samples to that of the real samples, and vice versa. During Adam optimization, the loss value of the mixed real and fake samples enhances signal transmission; it perturbs the update of the discriminator's parameters and prevents the discriminator from falling into a local sub-optimal state prematurely. Then, the hybrid loss augmented discriminator is applied to two types of text-to-image synthesis tasks to verify its scalability. Experimental results show that this method keeps the discriminator from falling into a local sub-optimal state and improves the performance of existing models.

To sum up, this dissertation focuses on generative model theory and its application in image synthesis. Five different methods are proposed and successfully applied to noise-to-image synthesis, image translation and text-to-image synthesis. The work has both theoretical and practical value.
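
As a rough illustration of the generator constraint described in item (2), the sketch below computes the Frobenius norm of the difference between the encoding of an image generated from perturbed noise and the original input noise. It is only a toy under stated assumptions, not the dissertation's implementation: the linear `generator` and `encoder`, the Gaussian perturbation with scale `sigma`, and the function name `denoising_penalty` are all hypothetical choices made for this example.

```python
import math
import random


def generator(z, weight):
    """Toy linear 'generator' (assumption): maps noise z to an image vector."""
    return [sum(w * zj for w, zj in zip(row, z)) for row in weight]


def encoder(x, weight):
    """Toy linear 'encoder' (assumption): maps an image vector to noise space."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in weight]


def denoising_penalty(z_batch, g_weight, e_weight, sigma, rng):
    """Frobenius norm of E(G(z + eps)) - z, taken over a batch of noise vectors.

    eps is Gaussian noise with standard deviation sigma (an assumed
    perturbation; the abstract does not say how the noise is perturbed).
    """
    squared_sum = 0.0
    for z in z_batch:
        z_perturbed = [zj + rng.gauss(0.0, sigma) for zj in z]
        z_encoded = encoder(generator(z_perturbed, g_weight), e_weight)
        squared_sum += sum((r - zj) ** 2 for r, zj in zip(z_encoded, z))
    return math.sqrt(squared_sum)


rng = random.Random(0)
identity = [[1.0, 0.0], [0.0, 1.0]]
z_batch = [[rng.gauss(0.0, 1.0) for _ in range(2)] for _ in range(4)]

# With identity maps the penalty reduces to the norm of the perturbation itself.
clean = denoising_penalty(z_batch, identity, identity, sigma=0.0, rng=rng)
noisy = denoising_penalty(z_batch, identity, identity, sigma=0.1, rng=rng)
print(clean, noisy)
```

In a real GAN this quantity would presumably be added to the generator loss with some weighting factor, so that the generator is pushed to produce images whose encoding stays close to the unperturbed noise; the weighting and the network architectures are not given in the abstract.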

Keywords/Search Tags:

generative model, generative adversarial networks (GANs), variational auto-encoders (VAE), image synthesis

Related items

1	Image Generation Based On Generative Adversarial Networks
2	Research Of Image Feature Disentanglement And Multi-attribute Editing Based On Conditional Generative Adversarial Networks
3	Image Synthesis Based On Generative Adversarial Networks
4	Research On Conditional Generative Adversarial Networks Model Based On VAE
5	Research On Person And Facial Image Synthesis Algorithm Based On Generative Adversarial Networks
6	Research Image Style Transfer Algorithm Based On Generative Adversarial Networks
7	Research On Auto-encoders And Generative Adversarial Network Based Speech Enhancement
8	Research On Image Synthesis Methods Based On Generative Adversarial Networks
9	The Research Of Face Frontalization Based On Generative Models
10	Research On Many-to-Many Voice Conversion Based On I-vector,Variational Auto-encoder And Generative Adversarial Networks For Non-parallel Corpora