| As the most popular open source software (OSS) hosting site, GitHub has accumulated 21 TB of OSS data. As of 2021, GitHub's OSS community has 73 million developers and over 26,400 repositories in total. However, with the rapid development of the OSS community, the number of distributed developers and the size of OSS code are also growing rapidly, which creates collaboration problems for both integrators and developers in the distributed development environment. From the developers' perspective, their knowledge of the project is often inadequate and delayed, so several developers may work on the same task; such duplicate contributions are usually useless. Integrators often reject duplicate contributions, which wastes developers' effort, burdens integrators, and consumes community resources. From the integrators' perspective, given the large number of contributions and tasks submitted to a project, it is important to merge contributions and resolve tasks correctly and efficiently. It is also important for integrators to use the software knowledge in the project to improve developers' understanding of it, since rich software knowledge improves developers' ability to contribute. This paper focuses on gathering the software knowledge of the open source community and improving the community's contribution efficiency. To address conflicts among resource units and the lack of software knowledge in the continuous iterative process of open source software, this paper thoroughly explores the task and contribution units in the issue tracking system. We also use information such as text descriptions, code changes, creation times, and creators to study the practical usage of the link mechanism in open source software, and we propose an automatic classification technique for OSS tasks and contribution units and an automatic recommendation method for related resource units. The main work and contributions are summarized as follows.
First, we conduct an empirical study of the practical usage of links in OSS projects. The study analyzes 5 large OSS projects hosted on GitHub and answers 5 research questions: what the link types are, where links are used, when links occur, how links are organized, and why links are used. We also provide practical suggestions to help developers and integrators in the OSS community make better use of the link mechanism. Second, focusing on the wide range of contribution and task types in OSS projects, we propose an automatic classification tool named TiTIC. TiTIC uses not only traditional text features but also a novel time feature to improve classification performance. Evaluation experiments on 26 popular GitHub projects showed that TiTIC achieves 5.77% higher precision than the traditional text-based method. Finally, to ease the burden of link creation, given the wide variety of reasons why links are used, we propose an automatic link recommendation tool named KG-Linker. KG-Linker creatively applies knowledge graph techniques to OSS projects. We first propose a method for constructing a knowledge graph of an OSS project that contains its historical contributions and tasks, users, creation times, and so on. We then apply knowledge embedding models to vectorize the project's knowledge graph in order to recommend related contributions and tasks. The evaluation was conducted on 5 OSS projects and covered both the coarse-grained and the fine-grained recommendation performance of KG-Linker. The results showed that KG-Linker's MRR increased by 3.38 times, Hits@1 by 3.29 times, Hits@3 by 1.92 times, and Hits@10 by 1.53 times. |
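
The ranking metrics reported for the recommendation task, MRR and Hits@k, can be illustrated with a minimal sketch. It is not the evaluation code used in this work. The function names, the query format, and the toy issue identifiers are all hypothetical, and it assumes that each query yields a ranked list of candidate resources and a set of truly linked ones.

```python
from typing import Dict, Sequence, Set, Tuple

# One query: (ranked candidate ids, set of truly linked ids). Hypothetical format.
Query = Tuple[Sequence[str], Set[str]]


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first relevant candidate, or 0.0 if none is ranked."""
    for position, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            return 1.0 / position
    return 0.0


def hit_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Return 1.0 if any relevant candidate appears in the top k, else 0.0."""
    return 1.0 if any(c in relevant for c in ranked[:k]) else 0.0


def evaluate(queries: Sequence[Query], ks: Sequence[int] = (1, 3, 10)) -> Dict[str, float]:
    """Average reciprocal rank (MRR) and Hits@k over all queries."""
    if not queries:
        raise ValueError("at least one query is required")
    n = len(queries)
    scores = {"MRR": sum(reciprocal_rank(r, rel) for r, rel in queries) / n}
    for k in ks:
        scores[f"Hits@{k}"] = sum(hit_at_k(r, rel, k) for r, rel in queries) / n
    return scores


# Toy data (hypothetical issue and pull-request ids, not from the study).
queries = [
    (["#12", "#7", "#3"], {"#7"}),        # first relevant item at rank 2
    (["#5", "#9", "#1"], {"#5"}),         # first relevant item at rank 1
    (["#4", "#8", "#2", "#6"], {"#6"}),   # first relevant item at rank 4
]
metrics = evaluate(queries)
print(metrics)
```

On this toy data the reciprocal ranks are 1/2, 1, and 1/4, so MRR is their mean; Hits@k counts the fraction of queries whose first relevant item falls within the top k.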