
Research On Early Prediction Of Highly Cited Papers

Posted on: 2022-03-18
Degree: Master
Type: Thesis
Country: China
Candidate: W B Wang
Full Text: PDF
GTID: 2480306524983589
Subject: Physics
Abstract/Summary:
As a new interdisciplinary subject, complex networks draw on mathematics, physics, computer science, and sociology. With the exponential growth of data that can be abstracted into a network structure, complex networks play a role in more and more areas. In the science of science (SciSci) in particular, many notable results have appeared. Predicting the future citation counts of papers, for example, has long been an active research direction, and many scholars have built models based on complex networks that predict a paper's subsequent citation dynamics from its early information.
This thesis uses the early information available after a paper is published to predict whether it has the potential to become highly cited, that is, the cold-start problem of highly cited paper prediction. At present, most studies of highly cited papers use a complete citation data set; in other words, they rely on long-term citation information after publication to predict whether a paper becomes highly cited. Such research overlooks the practical value of prediction, which lies in determining early after publication whether a paper can become highly cited. If highly cited papers can be predicted effectively at an early stage, future research directions can be identified, scientific research resources can be allocated reasonably, related research difficulties can be overcome, and research talent can be found as early as possible for further development.
The main work of this thesis is as follows:
(1) Introduced the research status and motivations of work on paper citation prediction and scholars' academic career prediction in SciSci, and summarized the three current mainstream types of citation prediction methods. The related theories, indicators, and methods used in this thesis are then reviewed, and 7 indicators are proposed that use the early information of papers to predict highly cited papers. These 7 indicators are divided into network-level indicators and individual-level indicators.
(2) Analyzed and processed the data set used in the thesis, including integration and cleaning of the citation data set and a preliminary analysis.
(3) Used the above indicators as features. The citation dynamics of highly cited papers are analyzed statistically on the basis of the individual-level indicators. The network-level indicators are the PageRank and LeaderRank indices computed on the citation network, and highly cited papers are predicted on the basis of these indices. Adding the PageRank and LeaderRank indices is found to effectively improve the prediction accuracy of the machine learning models. With these 7 indicators alone, highly cited papers can be predicted effectively.
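As a rough illustration of the two network-level indicators named in item (3), the sketch below computes PageRank and LeaderRank scores on a tiny citation network. It is not the thesis's implementation: the toy edge list, the function names `pagerank` and `leaderrank`, the damping factor of 0.85, and the convergence settings are all assumptions made for this example. Edges point from the citing paper to the cited paper. LeaderRank is written in its commonly described form, with a ground node linked in both directions to every paper and the ground node's final score shared equally among the papers.

```python
def pagerank(edges, nodes, damping=0.85, tol=1e-10, max_iter=1000):
    """PageRank by power iteration; edges are (citing, cited) pairs."""
    out = {v: [] for v in nodes}
    for src, dst in edges:
        out[src].append(dst)
    n = len(nodes)
    score = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Papers that cite nothing spread their score uniformly.
        dangling = sum(score[v] for v in nodes if not out[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            if out[v]:
                share = damping * score[v] / len(out[v])
                for w in out[v]:
                    new[w] += share
        diff = sum(abs(new[v] - score[v]) for v in nodes)
        score = new
        if diff < tol:
            break
    return score


def leaderrank(edges, nodes, tol=1e-10, max_iter=1000):
    """LeaderRank: a damping-free walk on the network plus a ground node."""
    ground = object()  # hypothetical sentinel standing in for the ground node
    out = {v: [ground] for v in nodes}   # every paper links to the ground node
    out[ground] = list(nodes)            # the ground node links to every paper
    for src, dst in edges:
        out[src].append(dst)
    score = {v: 1.0 for v in nodes}      # one unit of score per paper
    score[ground] = 0.0
    for _ in range(max_iter):
        new = {v: 0.0 for v in out}
        for v, targets in out.items():
            share = score[v] / len(targets)
            for w in targets:
                new[w] += share
        diff = sum(abs(new[v] - score[v]) for v in out)
        score = new
        if diff < tol:
            break
    # Share the ground node's score equally among the papers.
    ground_score = score.pop(ground)
    return {v: s + ground_score / len(nodes) for v, s in score.items()}


# Toy citation network (made up for illustration): (citing, cited).
nodes = ["A", "B", "C", "D"]
edges = [("B", "A"), ("C", "A"), ("D", "A"), ("C", "B"), ("D", "C")]

pr = pagerank(edges, nodes)
lr = leaderrank(edges, nodes)
for paper in sorted(nodes, key=lr.get, reverse=True):
    print(paper, round(pr[paper], 4), round(lr[paper], 4))
```

In a prediction setting of the kind the abstract describes, scores like these, computed on the citation network observed shortly after publication, could serve as two feature columns next to the individual-level indicators. The abstract does not say which machine learning models or feature pipeline were used, so that step is left out here.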
Keywords/Search Tags: complex network, science of science (SciSci), PageRank, LeaderRank, prediction of highly cited papers