
Research And Implementation Of Automated Program Repair Method Based On Pre-trained Model

Posted on: 2024-02-04    Degree: Master    Type: Thesis
Country: China    Candidate: H Y Zhang    Full Text: PDF
GTID: 2568306914988239    Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
Automatic program repair is a key technology for improving the efficiency of software defect repair, reducing its difficulty, and lowering software maintenance costs; it also frees software developers from the complex and tedious work of software operation, maintenance, and defect repair. In recent years, with deepening research on deep learning and neural networks, researchers in automatic program repair have gradually shifted their attention from traditional defect-repair techniques to repair methods driven by deep learning. Mainstream deep-learning-based approaches use neural machine translation models to translate defective code into correct code and thereby obtain patches. Although such approaches can capture the complex correspondence between defective code and correct code and automatically learn abstract repair templates from historical defect-repair data, when the input code sequence is too long the model struggles to learn the syntactic, semantic, and structural relationships between the defective statements and the surrounding statements, so repair results fall short of expectations. Characteristics of programming languages, such as long-range semantic dependencies and free variable-naming styles, also make serialized representations inefficient for feature learning, which directly limits how well a model learns code features. This paper therefore proposes two automatic program repair methods based on pre-trained models to address these limitations of existing methods. The main work of this paper is as follows:

1) A BERT-based approach that enhances automatic program repair with contextual semantic information. To better represent defective code, the method uses the pre-trained BERT model to encode source code as sequences and adds code summaries to the model input as source-code context, strengthening the model's ability to learn code semantics; an attention-based Transformer model then generates candidate patches (see the sketch following this summary). An empirical evaluation on Defects4J, currently the most widely used benchmark, shows that the approach generates compilable patches for 63 defects, 41 of which are correct, outperforming the baseline.

2) A GraphCodeBERT-based, information-enhanced automatic repair method. On top of the input that already includes code summaries, the method adds data-flow-graph information and uses GraphCodeBERT to encode the combined multi-source input as a sequence. During feature learning the model therefore captures deeper semantic information in the code and also learns code structure more effectively. An empirical evaluation on Defects4J shows that the approach generates compilable patches for 69 defects, 46 of which are correct, again outperforming the baseline.

3) An automatic program repair system. The system analyzes program defects automatically, helping software developers or users quickly understand the defects in a program and complete the subsequent defect-repair work.
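The following is a minimal sketch, not the thesis implementation, of the encoder-decoder setup described in approach 1: a pre-trained BERT-family encoder reads the defective code together with its summary, and an attention-based Transformer decoder would be trained to emit patch tokens. The model name "microsoft/codebert-base", the toy code snippet and summary, and the untrained decoder (layer count, hidden size, single greedy step) are illustrative assumptions.

```python
# Sketch only: encode defective code plus its summary with a pre-trained
# encoder, then run one step of a Transformer decoder over the encoder memory.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
encoder = AutoModel.from_pretrained("microsoft/codebert-base")

defective_code = "if (x = 0) { return y / x; }"            # assumed buggy snippet
code_summary = "return y divided by x when x is non-zero"  # summary used as context

# Concatenate summary and code so the encoder sees both sources of context.
inputs = tokenizer(code_summary, defective_code,
                   return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    memory = encoder(**inputs).last_hidden_state           # (1, seq_len, 768)

# Attention-based Transformer decoder that would be trained to generate patches.
decoder_layer = nn.TransformerDecoderLayer(d_model=768, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=6)
vocab_proj = nn.Linear(768, tokenizer.vocab_size)

# One (untrained) greedy decoding step, seeded with the first encoder state.
start = memory[:, :1, :]
logits = vocab_proj(decoder(start, memory))                # (1, 1, vocab_size)
print("predicted first patch token:", tokenizer.decode(logits.argmax(-1)[0]))
```

Approach 2 follows the same pattern, except that data-flow-graph tokens are appended to the input and the encoder is swapped for GraphCodeBERT (e.g. the "microsoft/graphcodebert-base" checkpoint), so the sketch above is only the serialized-input case.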
Keywords/Search Tags:Automatic program repair, Neural machine translation, Code representation, Pre-trained models