Image editing tasks involve modifying existing images or creating new ones according to user requirements. In recent years, deep learning techniques have automated image editing, improving both the quality of the results and the efficiency of the process. However, existing algorithms often struggle with details such as inconsistent instance shapes and missing or semantically mismatched features, which degrade the quality of the generated images. To address these challenges, this thesis proposes a Controllable Instance-Aware Generative Adversarial Network (CIA-GAN) for controllable instance-aware image generation. The proposed method integrates target and source masks to specify the desired shape of the transformed instance. A mask union module resizes and aligns the two masks from different domains, while an orientation consistency estimation module and a rotation module ensure that the source and target masks share the same orientation. The joint mask determines the coarse region for target instance generation; this region is encoded as a latent code and fed to an instance-aware generation module, which learns the instance generation process under a cycle-consistency assumption.

However, this approach only generates shape-controllable images and lacks diversity and detailed expression. To address this limitation, the thesis further proposes a Conditional Semantic Augmentation Generative Adversarial Network (CA-GAN). The method applies conditional augmentation to the text encoding, and features are extracted from the intermediate layers of the generator. The text encoding is passed through two perceptrons and then fused with the generated mask to improve the representation of details.

To verify the controllability and quality of the generated images, this thesis conducts quantitative and qualitative analyses as well as user studies on several datasets. Evaluation metrics such as LPIPS and FID are used to assess perceptual similarity and natural realism. The qualitative analysis includes visualizations of the generated images, and a questionnaire is designed to investigate whether the generated results meet user requirements. The results show that the proposed methods outperform previous works, generating more controllable and diverse images while improving the representation of subject features during image generation.
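The mask union and orientation steps described above can be illustrated with a minimal sketch. This is not the thesis's actual implementation; the function names (`resize_mask`, `principal_angle`, `mask_union`) and the choice of nearest-neighbour resizing and moment-based orientation estimation are assumptions made for illustration:

```python
import numpy as np

def resize_mask(mask, shape):
    # Nearest-neighbour resize so the two masks share a common grid
    # (a stand-in for the alignment done by the mask union module).
    rows = (np.arange(shape[0]) * mask.shape[0] / shape[0]).astype(int)
    cols = (np.arange(shape[1]) * mask.shape[1] / shape[1]).astype(int)
    return mask[rows][:, cols]

def principal_angle(mask):
    # One simple orientation estimate: the angle of the principal axis
    # of the foreground pixels, from second-order central moments.
    ys, xs = np.nonzero(mask)
    xs = xs - xs.mean()
    ys = ys - ys.mean()
    cov = np.cov(np.stack([xs, ys]))
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, np.argmax(eigvals)]  # eigenvector of largest eigenvalue
    return np.arctan2(major[1], major[0])

def mask_union(source_mask, target_mask):
    # Resize the target mask onto the source grid, then take the
    # pixel-wise union as the coarse region for instance generation.
    tgt = resize_mask(target_mask, source_mask.shape)
    return np.logical_or(source_mask > 0, tgt > 0).astype(np.uint8)
```

In the actual method, the rotation module would additionally rotate one mask so the two `principal_angle` estimates agree before the union is taken.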
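The conditional semantic augmentation step in CA-GAN can be sketched in the style of the conditioning-augmentation technique popularized by text-to-image GANs: two linear "perceptrons" map the text embedding to the mean and log-variance of a Gaussian, and the conditioning code is sampled via reparameterization. This is a hedged sketch under those assumptions, not the thesis's exact architecture; the weight matrices `W_mu` and `W_sigma` are hypothetical placeholders for learned layers:

```python
import numpy as np

rng = np.random.default_rng(0)

def conditioning_augmentation(text_embedding, W_mu, W_sigma):
    # Two linear maps ("perceptrons") produce the mean and log-variance
    # of a Gaussian over conditioning codes; sampling from it yields a
    # smoother, more diverse latent than the raw text embedding.
    mu = text_embedding @ W_mu
    log_var = text_embedding @ W_sigma
    eps = rng.standard_normal(mu.shape)       # reparameterization trick
    return mu + np.exp(0.5 * log_var) * eps
```

The sampled code would then be fused with the generated mask and the generator's intermediate features to sharpen detail representation.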
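Of the metrics mentioned, FID has a closed form worth recalling: it is the Fréchet distance between two Gaussians fitted to real and generated feature activations. The sketch below computes that formula directly from the fitted means and covariances; in practice the statistics come from Inception features, which are omitted here:

```python
import numpy as np

def frechet_distance(mu1, cov1, mu2, cov2):
    # FID between Gaussians (mu1, cov1) and (mu2, cov2):
    #   ||mu1 - mu2||^2 + Tr(cov1 + cov2 - 2 (cov1 cov2)^(1/2))
    diff = mu1 - mu2
    # Tr((cov1 cov2)^(1/2)) equals the sum of square roots of the
    # eigenvalues of cov1 @ cov2, which are real and non-negative
    # for positive semi-definite covariances.
    eigvals = np.linalg.eigvals(cov1 @ cov2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0, None)).sum()
    return diff @ diff + np.trace(cov1) + np.trace(cov2) - 2 * tr_sqrt
```

Lower values indicate that generated images are statistically closer to real ones; LPIPS, by contrast, scores perceptual similarity between individual image pairs using learned deep features.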