| Function name prediction is an important downstream task in code analysis. A good function name makes a program easier to read and helps developers understand code written by others, which is essential for extending and maintaining software products. In recent years, researchers have proposed a large number of function name prediction models. With the development of machine learning, function name prediction methods have gradually shifted from traditional code analysis to deep-learning-based code representation, and machine-learning-based function name prediction tools have proliferated. However, two problems remain when machine learning models are used for this task. First, prediction across different projects is not handled well: common methods can only complete the prediction task within a single project. Second, the function name library is limited, which leads to low prediction accuracy across methods. Therefore, this work proposes a cross-project function name prediction model based on a large function corpus. We first propose a method for extracting a large-scale function corpus from open-source Git repositories. Using a function extraction tool we designed, we extract all functions from the open-source projects that meet our conditions and then, after data cleaning and function filtering, build a large function corpus. We then use the Skip-gram model from natural language processing to pre-train vectors for code tokens. To represent code well as vectors (the code2vec task), we propose the AttBiLSTM model, which learns code vectors under the supervision of function names. In addition, to accelerate model training and improve prediction accuracy, we use the TF-IDF algorithm to identify the key tokens in the code and establish a set of candidate function names for each token, which improves both model efficiency and prediction accuracy. Finally, we conduct a full experimental comparison on the extracted large-scale function corpus. Experimental results show that, on the function name prediction task, our method outperforms other advanced methods in both model efficiency and prediction accuracy. |
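
The TF-IDF step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the TF-IDF scores are computed or how the candidate sets are built, so everything below is an assumption made for illustration: each function body is treated as one document, the highest-scoring tokens are taken as its key tokens, and each key token is mapped to the names of the functions it came from. The helper names (`tfidf_key_tokens`, `build_candidate_index`, `candidate_names`) and the toy corpus are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def tfidf_key_tokens(functions, top_k=2):
    """Return the top_k TF-IDF tokens for each (name, tokens) pair.

    Assumption: each function body is one "document" in the corpus.
    """
    n_docs = len(functions)
    # Document frequency: number of function bodies containing each token.
    df = Counter()
    for _, tokens in functions:
        df.update(set(tokens))

    key_tokens = []
    for _, tokens in functions:
        tf = Counter(tokens)
        scores = {
            tok: (count / len(tokens)) * math.log(n_docs / df[tok])
            for tok, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        key_tokens.append(ranked[:top_k])
    return key_tokens


def build_candidate_index(functions, key_tokens):
    """Map each key token to the set of function names it was a key token for."""
    index = defaultdict(set)
    for (name, _), keys in zip(functions, key_tokens):
        for tok in keys:
            index[tok].add(name)
    return index


def candidate_names(index, tokens):
    """Union of candidate names over the tokens of an unseen function."""
    names = set()
    for tok in tokens:
        names |= index.get(tok, set())
    return names


# Hypothetical toy corpus: (function name, tokens of the function body).
functions = [
    ("read_file", ["open", "path", "read", "return", "data"]),
    ("write_file", ["open", "path", "write", "data", "return"]),
    ("sum_list", ["total", "for", "item", "total", "add", "return"]),
]

keys = tfidf_key_tokens(functions)
index = build_candidate_index(functions, keys)
print(keys)
print(candidate_names(index, ["read", "buffer"]))
```

Restricting the predictor to the names returned by `candidate_names` shrinks the output space it has to score, which is one plausible reading of how a candidate set could help both efficiency and accuracy. Note that a token such as `return`, which appears in every function, receives a score of zero and so never becomes a key token when other tokens are available.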