| Integrated development environments (IDEs) provide a range of services that speed up software development, most notably code completion. To improve developer efficiency, researchers have proposed code completion methods based on information provided by the compiler and methods based on heuristic rules. However, the former limits the IDE's ability to perform code completion in dynamic languages, while the latter requires considerable time to modify and verify the rules whenever programming languages, frameworks, or APIs change. In addition, existing code completion components consider only the completion of a single token and ignore the completion of multiple tokens. To improve code completion, we study both single-token and multi-token code completion. To address the long-range dependency and Out-of-Vocabulary (OOV) problems in single-token code completion, we propose code completion methods based on Transformer and anonymization. First, the code is represented as an abstract syntax tree to exploit its structural information; then, a non-anonymized model based on Transformer-XL improves the handling of long-range dependencies; finally, the code is anonymized, and a model learns from and predicts the anonymized code, which improves the ability to predict OOV tokens. To address multi-token code completion, we propose a code completion method based on the fusion of code sequences and code graphs. First, multi-token code completion is treated as a generation problem and solved with a neural network model based on an encoder-decoder architecture; then, the abstract syntax tree of the code is enriched and represented as a graph to improve the model's understanding of the code; finally, code sequences and code graphs are fused to improve multi-token code completion. Experiments show that our single-token code completion method effectively addresses the long-range dependency and OOV problems, and that our multi-token code completion method effectively understands the code content and produces completions. |
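
The anonymization step described above can be illustrated with a minimal sketch. It is not the method evaluated in this work: it assumes Python source, uses the standard-library `ast` module, and maps each user-defined variable or argument name to a positional placeholder in order of first appearance. The `VAR_n` naming scheme and the helper names `Anonymizer` and `anonymize` are hypothetical choices made for illustration.

```python
import ast
import builtins


class Anonymizer(ast.NodeTransformer):
    """Replace variable and argument names with positional placeholders.

    Assumption for this sketch: built-in names (e.g. `sum`) are kept,
    because they belong to a fixed vocabulary and are not OOV.
    """

    def __init__(self):
        self.mapping = {}  # original name -> placeholder

    def _placeholder(self, name):
        # Assign placeholders in order of first appearance.
        if name not in self.mapping:
            self.mapping[name] = f"VAR_{len(self.mapping)}"
        return self.mapping[name]

    def visit_Name(self, node):
        if hasattr(builtins, node.id):
            return node  # keep built-ins untouched
        new_node = ast.Name(id=self._placeholder(node.id), ctx=node.ctx)
        return ast.copy_location(new_node, node)

    def visit_arg(self, node):
        node.arg = self._placeholder(node.arg)
        return node


def anonymize(source):
    """Return the anonymized source and the name-to-placeholder mapping."""
    tree = ast.parse(source)
    anonymizer = Anonymizer()
    tree = anonymizer.visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree), anonymizer.mapping  # ast.unparse needs Python 3.9+


example = (
    "def total_price(items, tax_rate):\n"
    "    subtotal = sum(items)\n"
    "    return subtotal * (1 + tax_rate)\n"
)
anonymized_code, name_map = anonymize(example)
print(anonymized_code)
print(name_map)
```

Because rare identifiers such as `subtotal` collapse into a small, closed set of placeholders, a model trained on the anonymized code never has to emit an unseen token; a predicted placeholder can be mapped back to the original name through the inverse of `name_map`.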