Research On Code Generation Method Using Multi-Source Information

Posted on: 2021-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Shi
GTID: 2428330647458919
Subject: Computer Science and Technology
Abstract/Summary:
Code generation is a recent semantic parsing task in artificial intelligence. It aims to generate executable code fragments that implement the function described in a natural language description. Research on the task is still at an early stage, and existing work mainly relies on the typical Encoder-Decoder framework. However, most existing work does not make full use of API documentation or of the grammatical and semantic structure of the natural language description. This thesis therefore conducts an experimental study of how to integrate API information and grammatical and semantic structure information into the code generation task. The work comprises the following three parts:

(1) A survey of the datasets and models used in existing code generation research. Existing methods are grouped into two types, Seq2Seq-based and Seq2Tree-based. The analysis compares how different models use the abstract syntax framework and points out the shortcomings of current research.

(2) The design and implementation of a code generation model that integrates API information. The system adopts the Seq2Actions method combined with an attention mechanism to implement an end-to-end neural network model. It first computes the similarity between the descriptions of function names and the natural language description, and selects a candidate set of similar function names. An attention model then disambiguates the candidate set, and the resulting candidates are fed into a code generation system based on a pointer-generator network. Experimental results show that the model using API information outperforms the baseline system.

(3) The design and implementation of a code generation model that combines grammatical and semantic structure. The model improves on the baseline system by using the grammatical and semantic structure of the natural language description to obtain a better semantic representation of it, and thereby a better alignment between natural language and code fragments. Two kinds of structural information are considered: the syntax tree of the natural language description and its semantic role labeling. An extended attention mechanism is then used to model this structural information. Experimental results show that the model combining grammatical and semantic structure information outperforms the baseline system.
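The candidate selection step in part (2) can be illustrated with a minimal sketch. The abstract does not say how similarity is computed, so the bag-of-words cosine similarity below is an assumption, and the names `select_candidates` and `API_DOCS`, along with the toy API descriptions, are hypothetical. The attention-based disambiguation and the pointer-generator decoder are not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_candidates(query, api_docs, k=2):
    """Return up to k function names whose descriptions best match the query.

    Assumption: similarity is plain bag-of-words cosine; the thesis may use
    a different (for example learned) similarity measure.
    """
    q = Counter(tokenize(query))
    scored = [
        (cosine(q, Counter(tokenize(desc))), name)
        for name, desc in api_docs.items()
    ]
    # Highest score first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:k] if score > 0]


# Hypothetical API documentation: function name -> short description.
API_DOCS = {
    "os.listdir": "return a list of the entries in a directory",
    "sorted": "return a new sorted list from the items in an iterable",
    "open": "open a file and return a file object",
}

candidates = select_candidates("open a file for reading", API_DOCS)
print(candidates)
```

Lexical overlap alone also returns loosely related names, which suggests why a separate disambiguation step over the candidate set is useful.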
Keywords/Search Tags:Code Generation, Deep Learning, Attention