Evaluation And Prediction Of Scientific Literature Impact Based On Citation Semantic Linkage

Posted on: 2020-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Wang
GTID: 2428330596491445
Subject: Software engineering
Abstract/Summary:
With the development of science and the advancement of information technology, the volume of scientific literature has grown rapidly, so researchers spend a great deal of time and energy finding high-quality literature. To help researchers mine high-quality literature and keep up with recent scientific advances more conveniently, the evaluation and prediction of scientific literature influence has received extensive attention in the academic community. Most existing research evaluates literature influence based on citation counts. However, these methods treat all citations as equally important and ignore the various factors that affect a paper's influence, such as the topic correlation between the cited and citing papers, the publication date of the paper, and the authority of the venue in which the paper was published. Moreover, it is unreasonable to evaluate the influence of recent scientific literature through citation counts, because a newly published paper needs time to accumulate citations. The purpose of this thesis is to use the semantic linkage of the scientific corpus to analyze the factors that affect the influence of papers, and then to objectively evaluate the current influence of papers and predict their future influence. The main work of the thesis is as follows:

(1) To address the problem that traditional evaluation methods ignore the differences between citations, a PageRank-based algorithm for evaluating the influence of scientific publications, STVRank, is proposed. First, papers are modeled from the perspective of semantic linkage: the topic correlation between citing and cited papers is analyzed quantitatively, and the effects of the time interval and of the venue influence of citing papers are analyzed. Second, these three factors are incorporated into the weighting design of an improved PageRank, and the resulting ranking is used to evaluate the influence of papers. Finally, experimental results on the AAN data set demonstrate that, compared with the baseline methods PageRank, WC, and SPRank, the STVRank algorithm significantly improves the effectiveness and robustness of the evaluation of scientific paper influence.

(2) To address the problem that existing methods have difficulty identifying the potential influence of recent papers, a method based on gradient boosting regression trees, GBDT-Hot, is proposed to predict their future influence. First, the factors that drive changes in the future influence of scientific papers are analyzed from three aspects: paper, author, and journal. At the same time, the attention received by a research topic is modeled according to the semantic linkage between keywords, and the role that literature attention plays in the future influence of the literature is then analyzed. Second, based on the gradient boosting regression tree model, a prediction model of future literature impact is constructed to predict the number of citations the literature will obtain in the future. Finally, experimental results on the AAN data set demonstrate that the GBDT-Hot model achieves better prediction accuracy than the baseline models TCM and FIP. The results also indicate that literature attention is a highly competitive feature for predicting the future impact of literature.
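The idea behind the first contribution, a PageRank in which citations do not count equally, can be illustrated with a minimal sketch. The abstract does not give the STVRank formula, so everything specific here is an assumption: the function names `weighted_pagerank` and `edge_weight`, the choice to combine the three factors (topic similarity, time interval, venue influence) by multiplication, the damping value 0.85, and the toy citation data are all hypothetical and are not taken from the thesis.

```python
def edge_weight(topic_similarity, time_factor, venue_factor):
    # Hypothetical combination rule: a plain product of the three factors.
    # The thesis may combine them differently; this is only an assumption.
    return topic_similarity * time_factor * venue_factor


def weighted_pagerank(edges, damping=0.85, iterations=100, tol=1e-10):
    """Weighted PageRank over a citation graph.

    edges maps (citing_paper, cited_paper) -> non-negative weight.
    Returns a dict mapping each paper to its score; scores sum to 1.
    """
    nodes = sorted({paper for pair in edges for paper in pair})
    n = len(nodes)

    # Total outgoing weight per citing paper, used to normalize its edges.
    out_weight = {v: 0.0 for v in nodes}
    for (src, _dst), w in edges.items():
        out_weight[src] += w

    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Papers with no outgoing weight spread their score uniformly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0.0)
        new_rank = {
            v: (1.0 - damping) / n + damping * dangling / n for v in nodes
        }
        # Each citing paper passes score along its edges in proportion
        # to the edge weight instead of splitting it equally.
        for (src, dst), w in edges.items():
            if out_weight[src] > 0.0:
                new_rank[dst] += damping * rank[src] * w / out_weight[src]

        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy data (invented for illustration): B cites A; C cites A and B.
citations = {
    ("B", "A"): edge_weight(0.9, 0.8, 1.0),
    ("C", "A"): edge_weight(0.2, 0.5, 0.6),
    ("C", "B"): edge_weight(0.7, 0.9, 0.6),
}
scores = weighted_pagerank(citations)
for paper in sorted(scores, key=scores.get, reverse=True):
    print(paper, round(scores[paper], 4))
```

In this toy graph, paper C passes most of its score to B rather than A, because the C-to-B citation carries the larger weight. With unweighted PageRank, C would split its score evenly between the two.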
Keywords/Search Tags: literature influence, semantic linkage, PageRank algorithm, gradient boosting regression tree algorithm, evaluation and prediction