
Research On Decomposition Of Natural Language Programming Task Based On Deep Learning

Posted on: 2019-11-09
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Liu
Full Text: PDF
GTID: 2428330611493343
Subject: Software engineering
Abstract/Summary:
With the development of technologies such as artificial intelligence and deep learning, more and more open source communities and open source software projects have emerged on the Internet, containing millions of lines of code. These code resources bring new opportunities and challenges to traditional software engineering: used well, they can greatly improve the quality and efficiency of software development. A large amount of research is under way in this direction, and this thesis focuses on technology that generates code automatically. However, existing technologies still have limitations. Automatic code generation usually cannot produce large-scale, complex programs. Code search is limited by its search space and may fail to find suitable code segments.

Therefore, this thesis proposes Lego, a deep learning-based task decomposition tool that decomposes a given high-level programming task into multiple low-level subtasks. While analyzing Java code, we found two types of comments in source code: one expresses high-level intent and can be regarded as a task, while the other expresses low-level intent and can be regarded as a subtask. We build a task decomposition data set from these two kinds of comments, with records of the form <task, [subtask 1, ..., subtask N]>. Lego uses deep learning to learn from this data set and build a task decomposition model. Lego's main work includes: (1) comment quality judgment: some low-quality comments in Java source code do not correctly describe the code segments that follow them, and such comments reduce the effectiveness of the whole approach, so we propose a supervised comment quality classifier to identify them; (2) sub-comment generation: while extracting sub-comments from Java source code, we found that most Java code contains few or no sub-comments, which leaves the data set too small, so for code segments that lack sub-comments we propose a method that generates them with a code summary generation model.

Experiments show that the F-score of the comment quality classifier reaches 92.78%, the BLEU score of the code summary generation model reaches 30.78, and Lego's BLEU score on task decomposition reaches 20.19. The effectiveness of the comment quality classifier and of sub-comment generation is evaluated with the control variable method, changing one component at a time. Adding the supervised comment quality classifier increases Lego's BLEU score by 1.32, and adding the data produced by the sub-comment generation method increases it by 7.6.
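To make the <task, [subtask 1, ..., subtask N]> record format more concrete, the following minimal Python sketch builds one such record from a commented Java method. It is an illustration only, not Lego's actual extraction pipeline: the abstract does not say how high-level and low-level comments are told apart, so the sketch assumes that the Javadoc block before a method is the task and that each `//` line comment in the body is one subtask. The names `DecompositionSample` and `extract_sample`, and the Java snippet itself, are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class DecompositionSample:
    """One <task, [subtask 1, ..., subtask N]> record (hypothetical name)."""
    task: str
    subtasks: List[str] = field(default_factory=list)


# Assumption: the Javadoc block before a method carries the high-level task.
JAVADOC = re.compile(r"/\*\*(.*?)\*/", re.DOTALL)
# Assumption: each "//" line comment in the body carries one low-level subtask.
# A plain regex also matches "//" inside string literals, so real code would
# need a proper Java parser.
LINE_COMMENT = re.compile(r"//\s*(.+)")


def extract_sample(java_method: str) -> DecompositionSample:
    """Pair the method-level comment with the inline comments that follow it."""
    doc = JAVADOC.search(java_method)
    if doc is None:
        raise ValueError("no high-level comment found")
    # Strip the leading '*' decoration from each Javadoc line.
    lines = [ln.strip().lstrip("*").strip() for ln in doc.group(1).splitlines()]
    task = " ".join(ln for ln in lines if ln)
    body = java_method[doc.end():]
    subtasks = [m.group(1).strip() for m in LINE_COMMENT.finditer(body)]
    return DecompositionSample(task, subtasks)


# Hypothetical Java input, invented for this illustration.
EXAMPLE = '''
/**
 * Copy a file to a backup location.
 */
void backup(Path src, Path dst) throws IOException {
    // check that the source file exists
    if (!Files.exists(src)) { throw new IOException("missing"); }
    // create the target directory
    Files.createDirectories(dst.getParent());
    // copy the file
    Files.copy(src, dst);
}
'''

sample = extract_sample(EXAMPLE)
print(sample.task)
for i, subtask in enumerate(sample.subtasks, start=1):
    print(f"  subtask {i}: {subtask}")
```

In this sketch a method with no inline comments yields an empty subtask list, which corresponds to the case the abstract describes as lacking sub-comments and handles by generating them with a code summary model.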
Keywords/Search Tags: task decomposition, deep learning, code summary generation, comment quality