Research On The Method Of Code Summary Generation Based On Deep Learning

Posted on:2024-08-06

Degree:Master

Type:Thesis

Country:China

Candidate:X W Li

GTID:2568307151460474

Subject:Computer Science and Technology

Abstract/Summary:

As product features iterate and software systems grow, source code becomes increasingly difficult to maintain. Code comments, the most direct semantic explanation of a source code snippet, play a crucial role in software maintenance. Code summary generation refers to automatically producing, for a given code snippet, a high-quality natural language comment that reflects the functionality of the code and the programmer's intent. It has become a research hotspot in intelligent software engineering. However, current methods still suffer from several problems, such as insufficient understanding of source code semantics and out-of-vocabulary (OOV) words. In response to these challenges, this thesis focuses on how to combine auxiliary tasks and abstract syntax trees (ASTs) to improve the effectiveness of code summary generation.

First, to address the problem that code semantic understanding ignores the relationship between code and comments, this thesis proposes an enhanced code summary generation method that incorporates a retrieval task. Built on an end-to-end architecture, the method uses code retrieval as an auxiliary task to code summary generation so that the two tasks learn a shared code representation. Because the code retrieval task embeds code and summaries into a unified high-dimensional vector space, the shared representation can learn the relationship between code and comments, which improves the accuracy of the generated summaries.

Second, to address the underutilization of code structure information, this thesis proposes a code summary generation method that integrates AST node features. The method combines the code sequence with AST node features to form the final semantic representation of the source code. When extracting AST features, it considers not only the adjacency features of nodes but also their position and degree information. The aim is to enrich the source code representation with node features and thereby improve the understanding of source code semantics.

Finally, experiments were conducted on publicly available Java and Python datasets to validate the proposed methods. The results show that the proposed methods outperform the baseline methods on all metrics, confirming that combining similar tasks with code structure information effectively enhances representation learning for source code and improves summary generation.
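
The abstract does not state how the summary generation and code retrieval objectives are combined, so the following is only a minimal sketch of one common multi-task setup, not the thesis's actual training objective. It assumes a token-level cross-entropy loss for generation, a margin-based ranking loss over cosine similarity for retrieval, and a weighted sum of the two. The function names, the `margin` value, and the `weight` value are all hypothetical.

```python
import math


def cross_entropy(probs, target_ids):
    """Mean negative log-likelihood of the reference summary tokens.

    probs[i] is the predicted distribution at step i; target_ids[i] is the
    index of the reference token at that step.
    """
    return -sum(math.log(p[t]) for p, t in zip(probs, target_ids)) / len(target_ids)


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def retrieval_loss(code_vecs, summary_vecs, margin=0.5):
    """Hinge ranking loss (assumed form): each code vector should be closer
    to its own summary than to every other summary by at least `margin`."""
    total, count = 0.0, 0
    for i, code in enumerate(code_vecs):
        positive = cosine(code, summary_vecs[i])
        for j, summary in enumerate(summary_vecs):
            if j != i:
                total += max(0.0, margin - positive + cosine(code, summary))
                count += 1
    return total / count if count else 0.0


def joint_loss(generation_loss, ret_loss, weight=0.5):
    """Weighted sum of the main-task and auxiliary-task losses (assumed form)."""
    return generation_loss + weight * ret_loss
```

Because both losses are computed from the same code encoder output in such a setup, gradients from the retrieval term shape the representation that the summary decoder reads, which is the effect the abstract attributes to the shared code representation.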
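
The three kinds of AST node information named in the abstract (adjacency, position, degree) can be illustrated with Python's standard `ast` module. This is a sketch under stated assumptions rather than the thesis's feature extractor: it handles only Python source, it treats "position" as a node's depth in the tree, and it counts degree over undirected parent-child edges. The abstract does not define either term precisely, and the function name `ast_node_features` is hypothetical.

```python
import ast


def ast_node_features(source):
    """Return (labels, edges, depths, degrees) for the AST of `source`.

    labels[i]  : node type name of node i (nodes numbered in pre-order)
    edges      : (parent, child) index pairs, i.e. the adjacency information
    depths[i]  : distance of node i from the root (assumed "position")
    degrees[i] : number of edges touching node i
    """
    tree = ast.parse(source)
    labels, edges, depths = [], [], []

    def visit(node, parent_index, depth):
        index = len(labels)
        labels.append(type(node).__name__)
        depths.append(depth)
        if parent_index is not None:
            edges.append((parent_index, index))
        for child in ast.iter_child_nodes(node):
            visit(child, index, depth + 1)

    visit(tree, None, 0)

    degrees = [0] * len(labels)
    for parent, child in edges:
        degrees[parent] += 1
        degrees[child] += 1
    return labels, edges, depths, degrees


if __name__ == "__main__":
    labels, edges, depths, degrees = ast_node_features("def add(a, b):\n    return a + b\n")
    for i, label in enumerate(labels):
        print(i, label, "depth:", depths[i], "degree:", degrees[i])
```

In a model of the kind the abstract describes, per-node values like these would be embedded and combined with the token-sequence encoding. How that combination is done is not stated in the abstract and is left out here.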

Keywords/Search Tags:

code summary generation, deep learning, code retrieval, abstract syntax tree

Related items

1	Research On Code Summarization Generation Method Based On Deep Learning
2	Automatically Based On The Abstract Syntax Tree And Static Analysis Of The Cloned Code Refactoring
3	Optimization Of Deep Code Repair Model Based On Grammar Rules
4	Design And Implementation Of Abstract Syntax Tree Based Code Defect Detection
5	Automatic Generation Of Code Comments Combining Tree2Seq And Attention Mechanism
6	Research And Application On Code Annotation Generation Based On Seq2seq
7	Pyreview: A Python Source Code Analysis Tool Based On Abstract Syntax Tree Differencing Algorithm
8	Research And Implementation Of Code Clone Detection Technology Based On Deep Learning
9	Research On Source Code Plagiarism Detection Based On Abstract Syntax Tree
10	Development Of Static Code Defect Detection Tool Based On Abstract Syntax Tree