Research On Image Data Augmentation Based On Generative Adversarial Network

Posted on:2023-08-05

Degree:Master

Type:Thesis

Country:China

Candidate:Y Yang

Full Text:PDF

GTID:2568306791496194

Subject:Software engineering

Abstract/Summary:

Data, computing power, and algorithms are the three major elements of Artificial Intelligence (AI). As the basis of AI, data is extremely important in all downstream tasks. Deep Learning (DL), a branch of AI, also needs a certain amount of data to train its models, and both the quantity and the quality of the data affect a neural network. If data is scarce and the dataset is smaller than the network requires for training, the model may fail to converge, or may not train normally at all, no matter how well the network is designed or how the hyperparameters are tuned. Collecting the required image, text, or voice data costs time and labor, and large amounts of data are very difficult to obtain in fields such as astronomy, medicine, security, aviation, and power systems. How to obtain high-quality datasets efficiently and at low cost has therefore become a research focus, and data augmentation is needed to supply models with the data they require. In this thesis, data augmentation covers two aspects: expanding the number of images and improving the quality of the data.

Generative Adversarial Networks (GANs) are developing constantly, and both conditional and unconditional generation show great potential for image data augmentation. However, existing GAN models often suffer from unstable training, difficult convergence, mode collapse, and low generated-image quality, so they cannot adequately meet the needs of expanding image datasets. The research goal of this thesis is therefore to propose effective GAN-based image data augmentation methods that further stabilize training and improve the quality of generated images, and thus expand image datasets with effective, higher-quality samples. Against this background, three methods are proposed: GAN-based data augmentation with a mixed attention mechanism, to alleviate unstable and difficult training; GAN-based data augmentation with spatial pyramid features, to further improve image quality; and GAN-based data augmentation with twin normalization, to prevent mode collapse when generating high-resolution images. The main research contents of this thesis are as follows:

1. To address training instability and difficult convergence in GAN-based data augmentation, data augmentation based on a GAN with a mixed attention mechanism (Mix-Atten-GAN) is proposed. The method introduces mixed attention, which comprises self-attention and channel attention. It stabilizes training by associating distant features in an image so that the generated objects are coordinated. Qualitative and quantitative evaluation shows that the proposed method further improves training stability while improving the GAN's ability to generate image details. The usability of the method is also demonstrated at the application level through classification experiments. To exclude the influence of classifier performance and isolate the effect of data augmentation, a classification network based on LeNet is designed. Test-set accuracy is compared before and after augmentation, using augmentation with real images and augmentation with different GANs, and the results demonstrate the usability of the proposed method.

2. To address the need to further improve generated-image quality in GAN-based data augmentation, data augmentation based on a GAN with spatial pyramid features (SPF-GAN) is proposed. The method introduces a spatial pyramid into the generator and the discriminator to better capture image edges. Spectral normalization is introduced so that the parameter matrices satisfy the Lipschitz constraint, which stabilizes training and improves the quality of the generated images. Experiments show that the proposed method captures feature information more comprehensively than other methods, which further improves the quality of the generated images. The usability of the proposed method is further confirmed at the application level.

3. To address mode collapse when generating high-resolution images in GAN-based data augmentation, data augmentation based on a GAN with twin normalization (Twin-Norm-GAN) is proposed. Generating high-resolution images generally requires a large number of training samples. With few samples, high-resolution generation is prone to mode collapse, in which the GAN produces only a single kind of sample. The method introduces two normalization methods, namely gradient normalization and attention normalization. It is compared qualitatively and quantitatively with state-of-the-art GAN models on the FFHQ and Panda datasets. Experiments show that the proposed method performs better and effectively prevents mode collapse while improving generative capability. Finally, the usability of the proposed method is further demonstrated through downstream-task experiments.
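
The abstract says spectral normalization is used so that the parameter matrices satisfy a Lipschitz constraint. The sketch below illustrates the general idea only and is not the thesis's implementation: it estimates the largest singular value of a weight matrix by power iteration and divides the matrix by it, so the resulting linear map has spectral norm close to 1. The function names `spectral_norm` and `spectrally_normalize`, the iteration count, and the seed are assumptions chosen for illustration.

```python
import math
import random


def spectral_norm(weight, n_iters=50, seed=0):
    """Estimate the largest singular value of a 2-D matrix by power iteration."""
    rng = random.Random(seed)
    rows, cols = len(weight), len(weight[0])
    # Random starting vector (assumption: any non-degenerate start works).
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    sigma = 0.0
    for _ in range(n_iters):
        # u = W v, scaled to unit length
        u = [sum(weight[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        u_norm = math.sqrt(sum(x * x for x in u)) or 1.0
        u = [x / u_norm for x in u]
        # v = W^T u; its length converges to the largest singular value
        v = [sum(weight[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / (sigma or 1.0) for x in v]
    return sigma


def spectrally_normalize(weight, n_iters=50):
    """Return weight / sigma_max so the linear map is roughly 1-Lipschitz."""
    sigma = spectral_norm(weight, n_iters)
    if sigma == 0.0:
        # An all-zero matrix is already 0-Lipschitz; return a copy unchanged.
        return [row[:] for row in weight]
    return [[w / sigma for w in row] for row in weight]


if __name__ == "__main__":
    w = [[3.0, 0.0], [0.0, 1.0]]
    print(spectral_norm(w))
    print(spectrally_normalize(w))
```

In a deep learning framework this rescaling is typically applied to each layer's weights at every training step, reusing the power-iteration vector between steps instead of restarting it.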

Keywords/Search Tags:

Data Augmentation, Generative Adversarial Network, Mixed Attention Mechanism, Spatial Pyramid, Twin Normalization

Related items

1	Research On Image Deblurring Method Based On Generative Adversarial Network
2	The Generative Adversarial Network For Data Augmentation In Pedestrian Detection
3	Research Of Image Data Enhancement Method Based On Generative Adversarial Network
4	Research On Method Of Reconstructing Scene Image From Audio Based On Generative Adversarial Network
5	Research On The Construction Of Complex Generative Adversarial Network
6	Research Of Infrared And Visible Image Fusion Algorithm Based On Generative Adversarial Network
7	Data Augmentation Method Based On Generative Adversarial Network And Its Application
8	Research On Image Deblurring Based On Generative Adversarial Network
9	Research On Expression Synthesis Algorithm Based On Generative Adversarial Network
10	Research On Generative Adversarial Network For Text-to-Image Synthesis