| The rise of open-source software platforms has made software development considerably more convenient. At the same time, the openly available code on these platforms provides a foundation for a large body of learning and research. In software defect detection, many studies already build on open-source code, examining code structure, code smells, code style, coding conventions, and so on. Analyzing code makes it faster and easier to find the errors developers introduce during development, and thus assists development. However, previous research on software defect detection takes the source code body as its main object of study and often overlooks source code comments, which carry developer-specific information. Comments often contain important information that the code body itself can hardly express. Compared with the code body, comments are also more difficult and challenging to analyze. This raises two questions: how can defect detection start from source code comments, and do comments contain any marker that indicates the presence of defects in the source code? The concept of technical debt offers a good answer. Technical debt is a trade-off between short-term goals and long-term code quality during software development. Building on it, the concept of self-admitted technical debt has been proposed: technical debt that a developer introduces intentionally and marks in code comments. Studies have shown that most technical debt results from deliberate trade-offs by developers rather than sloppy programming. In other words, self-admitted technical debt can serve as an indicator of technical debt. Further research has shown that intentionally introduced technical debt meets user needs only in the short term and, in the long run, creates potential pitfalls, both in terms of 
maintenance costs and overall code quality. Detecting self-admitted technical debt is therefore very important. First, this paper applies a pre-trained deep learning model to the self-admitted technical debt detection task for the first time, and proposes an improved loss function to address the class imbalance in self-admitted technical debt data. Experimental results show that the F1 score of the proposed detection model is 15.6% and 1.4% higher, respectively, than those of the two baseline models. Second, this paper proposes an explanation scheme based on mathematical statistics that can explain self-admitted technical debt detection methods based on keyword matching. An example shows that this scheme explains outliers well. Combined with existing deep learning interpretation tools, it is used to analyze the outliers in the experimental results from multiple perspectives, revealing the essential differences between the detection models. Finally, drawing on further data analysis, a method for selecting a self-admitted technical debt detection model is proposed. It gives recommendations for different experimental environments and training data, so that self-admitted technical debt can be detected more efficiently and downstream tasks can be performed better on that basis. As an extension, this paper integrates the model selection method into code inspection and puts forward a code inspection process, which makes the research more applicable in practice. |
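
To make the idea of keyword-matching detection concrete, the following is a minimal Python sketch of how a comment might be flagged as self-admitted technical debt. It is an illustration only, not the method evaluated in this paper: the keyword list `SATD_KEYWORDS` and the helper names `matched_keywords` and `is_satd` are hypothetical, and whole-word, case-insensitive matching is an assumption.

```python
import re
from typing import List

# Hypothetical keyword list, chosen for illustration only; it is not
# taken from the paper.
SATD_KEYWORDS = ("todo", "fixme", "hack", "xxx", "workaround", "temporary")

# Match any keyword as a whole word, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in SATD_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def matched_keywords(comment: str) -> List[str]:
    """Return the distinct keywords found in a comment, lowercased and sorted."""
    return sorted({m.group(1).lower() for m in _PATTERN.finditer(comment)})


def is_satd(comment: str) -> bool:
    """Flag a comment as self-admitted technical debt if any keyword matches."""
    return bool(matched_keywords(comment))


if __name__ == "__main__":
    for comment in (
        "// TODO: handle null input",
        "// returns the user name",
        "/* FIXME: ugly hack, todo clean up */",
    ):
        print(is_satd(comment), matched_keywords(comment), comment)
```

Because the decision depends only on which keywords occur, a matcher of this kind is easy to inspect, but it cannot flag debt that is described without any of the listed keywords.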