| With the strengthening of international academic cooperation and the rise of the "open science" movement, research collaboration and co-authorship have gradually become the mainstream trend in the development of modern science. Earlier methods, designed to evaluate the research achievements of a single author, cannot fully reflect the diversity of author contributions in modern research collaborations and cannot distinguish the contribution of one author from those of the other collaborators. In the scientific community, important decisions such as job appointments, title evaluations, and funding awards rest largely on the evaluation of a scholar’s achievements. It is therefore particularly important to determine whether a piece of scientific literature can be attributed to a given author. This thesis aims to improve the allocation algorithm by introducing the relative impact factor of literature and by removing the dilution effect that the contribution weight of the target paper imposes on the achievement owner, providing a supplementary solution to the problem of literature attribution. The scientific impact of literature is related to various factors, and the citation counts of publications follow two typical trends over time: "delayed rise-slow decay" and "early rise-rapid decay". This thesis therefore introduces a non-linear function that combines the impact of publications with their citation counts to characterize the relative impact of publications. Because there is no unified standard for identifying the attribution of scientific literature, this thesis uses the Nobel Prize, which is recognized by scholars worldwide, to verify identification accuracy. The Nobel Prize in Physics, which has stricter nomination and award standards, serves as the validation dataset. The judgment criterion was whether the achievement owners identified by the allocation algorithm were Nobel
Prize-winning authors or not. Using the most comprehensive Nobel Prize literature data retrieved from the American Physical Society (APS) dataset and the Microsoft Academic Graph (MAG) dataset, this thesis comprehensively compared the current mainstream algorithms. First, this thesis introduced a modified sigmoid function to characterize the relative influence of literature. Removing the contribution weight of the target paper improves the distinction between the achievement owner and the other authors and reduces the dilution effect brought by that weight. The NCCAS allocation algorithm was proposed on this basis. Then, this thesis compared the identification accuracy and identification resolution of NCCAS against the mainstream allocation algorithms CCA, NCCA, DCA, and CoCA. Finally, the robustness of the algorithms was tested. In addition, this thesis conducted ablation experiments to compare the impact of the target paper’s contribution on the allocation algorithm’s identification accuracy and resolution, and explored how constructing the score allocation matrix with different weights for previous research results affects the allocation algorithm. Furthermore, using literature data from the 9 disciplines included in the MAG dataset from 1990 to 2009, this thesis explored the distribution of academic indicators for achievement owners (the authors with the highest attribution score) across disciplinary fields and publication times. It also investigated the association between the length of the citation accumulation window and literature ownership. The findings are as follows: (1) The comparative experiments above show that the NCCAS algorithm proposed in this thesis demonstrates superior overall performance compared with the other mainstream algorithms. (2) This thesis applies the NCCAS algorithm to the massive scientific literature data in the MAG dataset, and the relevant academic data of the achievement owners in the large and small teams
are statistically analyzed. The analysis reveals the following distribution of academic indicators among achievement owners in large and small teams. In the distribution of relative academic age, the achievement owners in large teams are mostly not the academically oldest among their co-authors, while those in small teams are mostly the academically oldest scholars. In the distribution of authorship position, the achievement owners of large and small teams did not show a bias toward the first author or the last author, but the first author in small teams was more likely to become the achievement owner than the last author in large teams. In the distribution of the number of interdisciplinary fields, achievement owners in large teams are more likely not to be the author with the largest number of interdisciplinary fields, while achievement owners in small teams have a more than 50% probability of being the author with the largest number of interdisciplinary fields. In the distribution of the number of publications, achievement owners in both large and small teams are mostly the authors who have published the most among the co-authors. Meanwhile, the absolute academic age of older and non-older scholars among achievement owners gradually shows the phenomenon of "academic aging". (3) This thesis examines the effect of different citation accumulation time windows (T) on the stability of credit allocation algorithms for the scientific publications selected in this study. Cases in which the achievement owner was not consistent between the (T, T+1) citation accumulation windows were defined as "subversive situations" and used to measure the correlation between window size and the ownership of literature achievements. Additionally, the distribution of academic indicators under different window sizes was explored to determine whether it still followed the patterns described in (2). Results showed that the proportion of "subversive situations" was relatively small and decreased as the citation accumulation window size increased. When T≥7, this proportion approached
zero. Furthermore, under different window sizes, the distribution and trend of the achievement owners’ corresponding academic indicators remained consistent, but the specific interval proportions were affected by the citation window size. To sum up, this thesis introduces the NCCAS method, which is built on relative citation impact and removes the target paper’s weight, and which can reasonably allocate attribution both for literature of high scientific impact (Nobel Prize in Physics literature is used as the validation dataset) and for ordinary academic literature. Meanwhile, based on the allocation results for ordinary academic literature, regularities are found in the distribution of the corresponding academic indicators of achievement owners in large and small teams, and this distribution is not affected by changes in allocation algorithm, disciplinary field, publication time, or citation accumulation window size. The phenomenon of "academic aging" is identified in the corresponding trends of the academic indicator distributions, which has reference significance for China’s future talent introduction and scientific research. |
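
The "subversive situation" measure described in finding (3) can be illustrated with a minimal sketch. It assumes that, for each citation accumulation window T, some allocation algorithm has already produced a mapping from paper to achievement owner. The function name `subversion_rate`, the toy paper identifiers, and the author names are hypothetical and serve only as illustration; the code does not reproduce the NCCAS algorithm or the thesis's data.

```python
from typing import Dict, Hashable


def subversion_rate(
    owners_t: Dict[Hashable, Hashable],
    owners_t_plus_1: Dict[Hashable, Hashable],
) -> float:
    """Share of papers whose achievement owner differs between windows T and T+1.

    Both arguments map a paper identifier to the author with the highest
    attribution score under the corresponding citation accumulation window.
    Only papers present in both mappings are compared (an assumption of
    this sketch).
    """
    shared = owners_t.keys() & owners_t_plus_1.keys()
    if not shared:
        raise ValueError("no papers in common between the two windows")
    changed = sum(1 for paper in shared if owners_t[paper] != owners_t_plus_1[paper])
    return changed / len(shared)


# Hypothetical toy data: achievement owners per window size T.
owners_by_window = {
    1: {"p1": "alice", "p2": "bob", "p3": "carol", "p4": "dave"},
    2: {"p1": "alice", "p2": "erin", "p3": "carol", "p4": "dave"},
    3: {"p1": "alice", "p2": "erin", "p3": "carol", "p4": "dave"},
}

# Subversion rate for each consecutive pair of windows (T, T+1).
rates = {
    t: subversion_rate(owners_by_window[t], owners_by_window[t + 1])
    for t in sorted(owners_by_window)
    if t + 1 in owners_by_window
}
print(rates)
```

In this toy input the owner of one of four papers changes between T=1 and T=2 and none changes between T=2 and T=3, which mirrors in miniature the reported pattern of the proportion shrinking as the window grows.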