| As product features are iterated continuously and software systems grow in scale, source code becomes increasingly difficult to maintain. Code comments, as the most direct semantic interpretation of source code snippets, play a crucial role in software maintenance. Code summary generation refers to automatically generating, for a given code snippet, high-quality natural language comments that reflect the functionality of the source code and the programmer’s intentions, and it has become a research hotspot in intelligent software engineering. However, current code summary generation methods still suffer from many problems, such as insufficient understanding of source code semantics and out-of-vocabulary (OOV) words. In response to these challenges, this paper focuses on how to combine auxiliary tasks and abstract syntax trees (ASTs) to improve the effectiveness of code summary generation. First, to address the problem that the relationship between code and comments is ignored in code semantic understanding, this paper proposes an enhanced code summary generation method that incorporates a retrieval task. Based on an end-to-end architecture, the method uses code retrieval as an auxiliary task to code summary generation so that both tasks learn a shared code representation. Because the code retrieval task embeds code and summaries into a unified high-dimensional vector space, the shared representation can learn the relationship between code and comments, which improves the accuracy of summary generation. Second, to address the underutilization of code structure information, this paper proposes a code summary generation method that integrates AST node features. The method combines the code sequence with AST node features to form the final semantic representation of the source code. When extracting AST features, it considers not only the adjacency features of nodes but also their position and degree information. The aim is to enrich the representation of source code by integrating node features, thereby enhancing the understanding of source code semantics. Finally, experiments on publicly available Java and Python datasets validate the effectiveness of the proposed methods. The results show that the proposed methods outperform the baseline methods on all metrics, confirming that combining similar tasks and code structure information can effectively enhance the representation learning of source code and improve summary generation. |
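
To make the notion of abstract syntax tree node features concrete, the following is a minimal sketch of how adjacency, position, and degree information could be read off a syntax tree. It is an illustration only, not the paper's implementation: the abstract does not define these features precisely, so the sketch assumes that "position" means preorder index plus depth and that "degree" counts parent and child links. The function name `ast_node_features` is hypothetical, and Python's standard `ast` module stands in for whatever parser the authors used.

```python
import ast


def ast_node_features(source):
    """Parse `source` and return (nodes, edges) for its syntax tree.

    nodes: one dict per node with its type name, preorder index,
           depth, and degree (assumed feature definitions).
    edges: (parent_index, child_index) pairs, i.e. the adjacency.
    """
    tree = ast.parse(source)
    nodes, edges = [], []

    def visit(node, depth, parent):
        idx = len(nodes)  # preorder position of this node
        nodes.append({"type": type(node).__name__,
                      "index": idx, "depth": depth, "degree": 0})
        if parent is not None:
            edges.append((parent, idx))
            nodes[parent]["degree"] += 1  # link counted at the parent
            nodes[idx]["degree"] += 1     # and at the child
        for child in ast.iter_child_nodes(node):
            visit(child, depth + 1, idx)

    visit(tree, 0, None)
    return nodes, edges


if __name__ == "__main__":
    nodes, edges = ast_node_features("def add(a, b):\n    return a + b\n")
    for n in nodes:
        print(n)
    print(edges)
```

In a model such as the one described, per-node values like these would typically be embedded and combined with the token-sequence representation; how that combination is done is not stated in the abstract and is left out of the sketch.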