Research On Link Prediction In Scientific Collaboration

Posted on: 2020-07-26  Degree: Master  Type: Thesis
Country: China  Candidate: W Guo  Full Text: PDF
GTID: 2370330599976423  Subject: Optical Engineering
Abstract/Summary:
Effective collaboration among scientists plays an important role in advancing science and disseminating knowledge. Achievements such as monographs, research projects, and large-scale scientific programs result from the efforts and contributions of many researchers. How to improve the efficiency of research cooperation and how to find suitable research partners have therefore become topics of common interest in academia and industry. Drawing on the theory of scientific collaboration networks and complex networks, this thesis studies link prediction in scientific collaboration networks in order to explore how network connections can be reconstructed when part of the network information is missing. The main work of the thesis is summarized as follows.

First, related work on complex network theory and link prediction is reviewed. The evaluation indicators for link prediction are introduced in detail, along with three families of methods: methods based on structural information, methods based on maximum likelihood estimation, and methods based on probabilistic models. Of these, methods based on structural information are considered the mainstream approach.

Second, network data capture and data visualization are introduced. The basic concepts and common types of web crawlers and the theory of network data visualization are summarized and analyzed, and the working principles and processes of focused web crawlers and general web crawlers are discussed. In particular, the use of the Scrapy framework and the data processing procedure are described in detail. Two methods are then used to visualize the captured data. The first is MATLAB programming: MATLAB is used to simulate the network data and output the network diagram, but this method is suitable only for networks with a small amount of data. The second uses dedicated visualization software such as VOSviewer and Gephi, which achieved good visualization results.

Third, a meta-path computation prediction (MPCP) algorithm based on meta-paths and random walks is proposed. The algorithm establishes two meta-paths: the co-author meta-path (A-A-A) and the common-keyword meta-path (A-D-A). It builds the heterogeneous information network G=(A,D,R) from author and keyword data collected from Web of Science. MPCP combines the meta-paths with a random walk and takes the overlap of the meta-paths into account to predict the probability that new links are established under different link blocking thresholds. The recovery rate of links is used as the evaluation index, and real research collaboration networks are constructed using quantum communication and link prediction as the respective research topics. The simulation results show that the number of recovered links decreases as the link blocking threshold increases. In the worst case, when the blocking threshold reaches its maximum of 1, the MPCP algorithm still restores at least 50% of the broken links.
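The abstract does not give the MPCP formulas, so the following is only a minimal sketch of how random walks along the two meta-paths (A-A-A and A-D-A) could be turned into scores for author pairs. It is not the thesis's algorithm. The function names, the `weight` parameter that mixes the two meta-paths, the symmetrization step, and the `min_score` cutoff are all assumptions made for illustration. In particular, `min_score` is a hypothetical stand-in and may not match the thesis's definition of the link blocking threshold. The toy matrices are invented data.

```python
import numpy as np

def row_normalize(m):
    """Convert a non-negative count matrix into random-walk transition probabilities."""
    sums = m.sum(axis=1, keepdims=True)
    # Rows with no outgoing links stay all-zero instead of dividing by zero.
    return np.divide(m, sums, out=np.zeros_like(m, dtype=float), where=sums > 0)

def meta_path_scores(coauthor, author_keyword, weight=0.5):
    """Score author pairs by two-step random walks along A-A-A and A-D-A.

    coauthor:       (n_authors, n_authors) symmetric 0/1 co-authorship matrix
    author_keyword: (n_authors, n_keywords) 0/1 author-keyword matrix
    weight:         hypothetical mixing weight between the two meta-paths
    """
    coauthor = np.asarray(coauthor, dtype=float)
    author_keyword = np.asarray(author_keyword, dtype=float)

    p_aa = row_normalize(coauthor)           # author  -> author
    p_ad = row_normalize(author_keyword)     # author  -> keyword
    p_da = row_normalize(author_keyword.T)   # keyword -> author

    aaa = p_aa @ p_aa                        # walk along A-A-A
    ada = p_ad @ p_da                        # walk along A-D-A

    scores = weight * aaa + (1.0 - weight) * ada
    scores = (scores + scores.T) / 2.0       # assumption: treat links as undirected
    np.fill_diagonal(scores, 0.0)            # ignore self-links
    return scores

def recovery_rate(scores, removed_links, min_score):
    """Fraction of removed links whose score reaches the hypothetical cutoff."""
    if not removed_links:
        return 0.0
    recovered = sum(1 for i, j in removed_links if scores[i, j] >= min_score)
    return recovered / len(removed_links)

# Toy data (invented): 4 authors, 3 keywords; the link (0, 2) has been removed.
coauthor = [[0, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 1, 0, 1],
            [0, 0, 1, 0]]
author_keyword = [[1, 0, 0],
                  [1, 1, 0],
                  [1, 0, 1],
                  [0, 0, 1]]

scores = meta_path_scores(coauthor, author_keyword)
print(scores[0, 2], recovery_rate(scores, [(0, 2)], min_score=0.1))
```

In this toy network, authors 0 and 2 share a co-author and a keyword, so the removed pair gets a positive score, while authors 0 and 3 share neither and score zero. Raising `min_score` can only lower the recovery rate, which mirrors the qualitative trend reported in the abstract.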
Keywords/Search Tags: Scientific collaboration network, Complex network, Link prediction, Meta path, Random walk