| With the rapid development of artificial intelligence algorithms, computers can now automatically generate text that conforms to language rules and is semantically fluent, and many text generation tasks have mature technical solutions. In software development, code comments are important: they can greatly improve the efficiency of maintenance staff and reduce the maintenance cost of a project. However, developers often spend most of their time at the beginning of a project implementing functional logic, and because of tight schedules a large amount of annotation text is never written. Researchers have proposed various neural network-based annotation generation models, but these models still leave considerable room for improvement in accuracy and fluency. This thesis treats code annotation generation as a text translation task and carries out a series of studies on annotation generation for Java methods based on the Sequence-to-Sequence (Seq2Seq) model. The main contents are as follows: 1. The thesis presents an annotation generation model, CSE-GC, based on the Seq2Seq model. The encoder of this model is composed of a Gated Recurrent Unit (GRU) and a Convolutional Neural Network (CNN), which respectively extract the structural and semantic information of the code. The Abstract Syntax Tree (AST) is an abstract representation of the syntactic structure of Java source code, in which each node corresponds to a construct in the source code. To make it easier for the encoder to obtain code structure information, this thesis proposes a Code Structure Enhancement (CSE) traversal method for abstract syntax trees. The effectiveness of the annotation generation model and of the CSE traversal method is validated through experiments on a common dataset that compare CSE-GC with other advanced annotation generation models. The results demonstrate a significant improvement in
both BLEU-4 and METEOR for CSE-GC, which reaches 45.01% and 30.95%, respectively. This indicates that CSE-GC performs well at generating annotations. 2. The Seq2Seq-based annotation generation model CSE-GC has a serious drawback: it is not robust, and the generated annotation text is often of poor quality when the input is even slightly perturbed. To enhance robustness and alleviate data sparsity, this thesis proposes an annotation generation architecture, GAN-CSE-GC, that incorporates Generative Adversarial Networks (GAN). Specifically, the thesis uses CSE-GC as the generator of GAN-CSE-GC and designs a CNN-based network as the discriminator. In addition, the thesis proposes a method for generating noisy code data; the constructed noisy data and the real data are both fed into GAN-CSE-GC for adversarial training. The experimental results show that, on noisy data, GAN-CSE-GC achieves a 0.6% improvement in BLEU-4 and a 1.14% improvement in METEOR over CSE-GC. These results indicate that adversarial training improves the robustness of CSE-GC when handling noisy input. |
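To illustrate what it means to traverse an abstract syntax tree into a sequence that a recurrent encoder can consume, the following is a minimal sketch of a generic bracketed pre-order traversal. It is not the CSE method, whose details are not given in this abstract; the `Node` class, the `linearize` function, and the hand-built example tree are all hypothetical and serve only to show how structural information can be preserved in a flat token sequence.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical AST node: a type label plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def linearize(node: Node) -> List[str]:
    """Pre-order traversal that brackets every subtree.

    Each subtree is emitted as "( label <children> ) label", so the
    parent/child boundaries stay recoverable from the flat sequence.
    A plain pre-order listing of labels would lose that information.
    """
    tokens = ["(", node.label]
    for child in node.children:
        tokens.extend(linearize(child))
    tokens.extend([")", node.label])
    return tokens


# Hand-built, simplified tree (an assumption for illustration) for a
# Java method such as: int add(int a, int b) { return a + b; }
method = Node("MethodDeclaration", [
    Node("Parameter"),
    Node("Parameter"),
    Node("ReturnStatement", [Node("BinaryExpression")]),
])

sequence = linearize(method)
print(" ".join(sequence))
```

In a Seq2Seq setting, a sequence like this would typically be mapped to embedding indices and passed to the structural encoder, while the raw code tokens go to the semantic encoder; how CSE-GC actually combines the two is described in the thesis body, not here.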