Research On Code Generation Method With Enhanced Program Syntax Structure Representation

Posted on:2024-07-21

Degree:Master

Type:Thesis

Country:China

Candidate:J Li

Full Text:PDF

GTID:2568307058977669

Subject:Computer Science and Technology

Abstract/Summary:

Code generation is a technology of practical importance that can assist programmers in their development work. In integrated development environments, it can be embedded as a plug-in that suggests code as the programmer writes. By automating part of the coding process, it can reduce programmers’ development time and effort while improving the quality and maintainability of the code. Current code generation approaches are inspired by text generation methods in natural language processing, such as machine translation, and treat code generation as a text generation task. This treatment ignores the difference between code and text. Code has a complex syntactic structure, yet large models do not consider the syntactic information of code during generation. This thesis focuses on the use of code syntax structure in code generation and addresses the following problems:

(1) Insufficient extraction of code information. There are two modes of code generation: generating code tokens directly, and generating an abstract syntax tree of the code. The problem with both modes is that they do not make use of the structural information of the code, such as the tree structure of the abstract syntax tree.

(2) Inadequate integration of code structure information. Existing methods fuse the multiple modalities of code by simple concatenation, ignoring the relative weighting of the modalities during fusion.

(3) Inadequate natural language semantic understanding. Existing approaches treat text as a sequential structure, which makes it difficult to align text with code that carries structural information. Text also has structural information, such as sentence parse trees. The challenge is to make use of it so that the structural information of the code is aligned with the structural information of the text.

To address these problems, this work proposes two code generation methods based on program syntax structure:

(1) A feedback-network-based code generation method. A feedback network is proposed to incorporate structural information into the code generation process. During generation, the already generated part of the code is parsed into a code combination graph with syntactic structure, and this syntactic information is supplied to the model through an attention fusion mechanism to improve the quality of the generated code.

(2) A code generation method based on graph-structure contrastive learning. To solve the problem that code and natural language descriptions are not aligned, both the code and the natural language description are extracted as syntax trees and aligned in a shared syntax space through contrastive learning. After this alignment pre-training, the model improves on the metrics of code generation tasks.
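The gap between the token view and the syntax-tree view of code, which motivates problem (1), can be illustrated with a minimal sketch. It uses only Python's standard `tokenize` and `ast` modules on a toy function. The helper names `token_view` and `ast_edges` are hypothetical and chosen for illustration; the abstract does not state which programming language or parser the thesis uses, so this is not the thesis's implementation.

```python
import ast
import io
import tokenize

# Toy input program (an assumption for illustration only).
SOURCE = "def add(a, b):\n    return a + b\n"


def token_view(source):
    """Return the flat token sequence, as a purely sequential model would see it."""
    layout = {
        tokenize.NEWLINE,
        tokenize.NL,
        tokenize.INDENT,
        tokenize.DEDENT,
        tokenize.ENDMARKER,
    }
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    # Drop layout-only tokens so that only lexical content remains.
    return [tok.string for tok in tokens if tok.type not in layout]


def ast_edges(source):
    """Return (parent type, child type) edges of the abstract syntax tree."""
    tree = ast.parse(source)
    edges = []
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            edges.append((type(parent).__name__, type(child).__name__))
    return edges


tokens = token_view(SOURCE)
edges = ast_edges(SOURCE)

print(tokens)
for parent, child in edges:
    print(f"{parent} -> {child}")
```

The token list records only the order of the symbols, while the edge list makes explicit that, for example, the `a + b` expression is a child of the `return` statement. An edge list of this kind is the sort of structural signal a graph-based encoder could consume.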

Keywords/Search Tags:

Code generation, graph neural networks, pre-trained models, program syntax structure

Related items

1	Structure-aware Graph Neural Network For Code Comment Generation
2	Pre-training For Program Understanding And Generation
3	Research And Implementation Of Automated Program Repair Method Based On Pre-trained Model
4	Design And Implementation Of A Database Operation Code Generation Module Based On Pre-trained Model
5	Eventic Graphs Construction And Application Methods For Textual Event Prediction
6	Research On Code Comment Generation Method Based On Neural Network
7	Research On Code Analysis Method Based On Neural Network Language Model
8	Research And Application On Code Annotation Generation Based On Seq2seq
9	Research On Reading Comprehension Method Based On Graph Neural Network
10	Research On Multiple-choice Machine Reading Comprehension Based On Graph Convolutional Neural Networks