With the rapid development of information technology, the same information can be presented in different ways, forming images in various domains. Image information from different domains can complement each other and provide a richer visual experience in people's lives. Cross-domain image translation algorithms have therefore been studied intensively in the field of computer vision. Depending on whether paired data are available, cross-domain image translation algorithms can be divided into supervised and unsupervised modes; both modes aim to preserve the integrity of the translated information and improve the visual quality of the translated images. Among them, cross-domain image translation based on the Generative Adversarial Network, in which a generator and a discriminator compete against each other until a Nash equilibrium is reached, has brought a novel approach to this field and become the most widely used method. Consequently, cross-domain image translation based on Generative Adversarial Networks has become one of the research hotspots in computer vision, with important theoretical and practical significance. However, two important challenges remain.

(1) When paired data are present, conventional supervised translation algorithms do not consider the information differences between cross-domain images or the feature differences under different mapping ranges, resulting in blurred contours, rough textures, and false coloring in the generated images.

(2) When paired data are absent, most unsupervised translation algorithms optimize the overall model through cyclic reconstruction alone, without distinguishing the information differences between the source-domain and target-domain images. As a result, common and unique information are mixed together, and the feature patterns of the cross-domain images differ greatly.

To address these two challenges, this thesis proposes cross-domain image translation algorithms based on the Generative Adversarial Network to improve image translation quality. The main research contents and innovations are as follows:

(1) Considering feature-level information fusion, this thesis proposes a parallel fusion and multi-scale discrimination model for supervised image-to-image translation. First, with paired data, supervised image translation methods usually ignore the information differences between cross-domain images, which leads to blurred contours in the generated images. To address this problem, this thesis proposes a parallel feature fusion generator that uses a two-branch fusion structure to connect cross-domain images: the main branch extracts deep features of the source-domain images and completes the feature translation to the target domain, while the subsidiary branch extracts the contour information of the source-domain images and combines it with the target-domain features through skip connections, improving the contour clarity of the whole image. Second, image features vary greatly under different mapping ranges, and judging realism from features at a single scale alone leads to rough textures in the generated images. To solve this problem, this thesis employs a multi-scale discriminator that jointly discriminates features of the generated images at different scales, enriching both the local and the global texture of the images. In addition, the paired data are used only as a pixel-level constraint on the generated images; this constraint can only guarantee similarity in style and does not guarantee color fidelity. Therefore, this thesis proposes a chromatic aberration loss, which emphasizes image color information through Gaussian blur convolution and enhances the realism of the generated images (a minimal sketch of this loss follows below). With feature-level information fusion, the method achieves favorable subjective and objective evaluation metrics and clear, realistic visual results on two cross-domain supervised datasets with large information differences.
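The chromatic aberration loss can be illustrated with a short code sketch. The abstract does not specify the kernel size, standard deviation, or distance function, so the values below are assumptions; the sketch only shows the general idea of comparing Gaussian-blurred images so that low-frequency color information dominates the loss.

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(size: int = 5, sigma: float = 1.5) -> torch.Tensor:
    # Normalized 2-D Gaussian kernel from the outer product of two 1-D kernels.
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    return torch.outer(g, g)

def gaussian_blur(img: torch.Tensor, size: int = 5, sigma: float = 1.5) -> torch.Tensor:
    # Depthwise convolution: each channel is blurred independently.
    channels = img.shape[1]
    kernel = gaussian_kernel(size, sigma).to(img.device, img.dtype)
    kernel = kernel.expand(channels, 1, size, size).contiguous()
    return F.conv2d(img, kernel, padding=size // 2, groups=channels)

def chromatic_aberration_loss(fake: torch.Tensor, real: torch.Tensor) -> torch.Tensor:
    # Blurring suppresses high-frequency texture detail, so the L1 distance
    # between the blurred images is dominated by differences in color layout.
    return F.l1_loss(gaussian_blur(fake), gaussian_blur(real))
```

In training, such a term would typically be added to the generator objective with a weighting coefficient (a hyperparameter assumed here), e.g. `loss_G = loss_adv + lambda_color * chromatic_aberration_loss(fake, real)`.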
(2) Considering image-level information fusion, this thesis proposes a semantic cooperative shape perception model for unsupervised image-to-image translation. In the absence of paired data, unsupervised image translation algorithms often use a symmetric pair of Generative Adversarial Networks to complete cross-domain translation and mix information from different domains to obtain the generated images. In object-level image translation, foreground and background textures are then easily confused, which degrades the overall visual quality of the generated images. Therefore, this thesis proposes a texture-semantic co-generator, in which a unique texture generator extracts the texture information of the target domain and a shared semantic generator extracts the semantic information of the source domain. Following the idea of foreground-background separation, the source-domain images, style images, and semantic images are combined through feature fusion to generate the output images. In addition, because the unsupervised setting provides no corresponding paired images, the feature morphology of the generated images can differ greatly from that of the source-domain images. To prevent this, this thesis proposes a shape perception loss that constrains the semantic information extracted from images of different domains and strengthens the network's overall perception of the semantic images (a minimal sketch follows below). With this image-level information fusion method, visual results with clear textures and favorable evaluation metrics are obtained on two supervised and four unsupervised datasets.
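As a rough illustration of the shape perception loss, the sketch below assumes that the shared semantic generator exposes an encoder (`semantic_encoder`, a hypothetical stand-in) mapping an image to a semantic map, and that the loss penalizes the L1 distance between the semantic maps of the source and generated images; the abstract does not fix these details, so they are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShapePerceptionLoss(nn.Module):
    # Constrains the generated image to keep the semantic (shape) layout of
    # the source image, even though no paired ground truth is available.

    def __init__(self, semantic_encoder: nn.Module):
        super().__init__()
        # Hypothetical module standing in for the shared semantic generator.
        self.semantic_encoder = semantic_encoder

    def forward(self, source: torch.Tensor, generated: torch.Tensor) -> torch.Tensor:
        sem_src = self.semantic_encoder(source)      # semantic map of the source image
        sem_gen = self.semantic_encoder(generated)   # semantic map of the translated image
        # Detaching the source map treats it as a fixed target so gradients
        # flow only through the generated image (a design assumption).
        return F.l1_loss(sem_gen, sem_src.detach())
```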