
The Application On Prediction Of Scientists’ Hot Streak Based On Academic Big Data

Posted on: 2024-06-06
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Zhu
GTID: 2530307106990149
Subject: Computer technology
Abstract/Summary:
With the development of science itself, the Science of Science (SOS) has become an important research field that aims to understand, quantify, and predict scientific research and its outcomes. The evolutionary patterns of individual scientists’ careers are an important research direction within SOS. Recent studies have identified a specific period in which an individual’s performance is significantly better than their typical performance. This period is called the hot streak. By capturing the hot streak in a scientist’s career with the hot streak model (HSM), the evolutionary patterns of scientists’ careers can be analyzed quantitatively. Because hot streaks are widespread and substantially improve scientists’ careers, predicting them effectively is of great significance for identifying research talent and advancing scientists’ career development. However, owing to the complexity of scientists’ careers, effective methods for predicting hot streaks are still lacking.

Understanding the rules underlying the occurrence of hot streaks can help scientists improve their academic impact and performance by adjusting their career development paths. Recent studies point to exploration and exploitation as two important behaviors that may affect the occurrence of a hot streak: scientists’ research behavior tends to be exploratory before the hot streak and exploitative during it. Moreover, the beginning of a hot streak is associated not with exploration or exploitation alone, but with the "exploration-exploitation" behavior sequence, i.e., the transition from exploration to exploitation. In addition, team collaboration, one of the most prominent features of modern science, plays an important role in scientists’ careers. The increasing proportion of collaborative publications and the expanding size of research teams indicate that research output depends more and more on collaboration. Exploring the relationship between the collaboration patterns of research teams and scientists’ hot streaks can therefore deepen the understanding of the rules underlying hot streaks. Furthermore, combined with existing research results, it allows reasonable career development suggestions to be made to scientists, helping them adjust their career development paths and improve their academic impact and performance.

In this thesis, we first propose an optimized hot streak model and, after verifying its effectiveness, construct a hot streak prediction model (HSPM). Secondly, we propose team entropy to quantify the team research behavior of scientists and discuss the potential relationship between collaboration patterns and hot streaks. Finally, we construct a career analysis system for scientists. The main research content of this thesis is as follows:

1) We obtained and preprocessed the American Physical Society (APS) dataset and the academic social network dataset ArnetMiner. By investigating the temporal colocation of high-impact publications in the selected datasets, we confirmed the existence of hot streaks in scientists’ careers. We extracted and analyzed the individual-specific HSM parameters of the scientists in the datasets. The hot streak prediction problem was transformed into a micro prediction problem over scientists’ individual careers: the academic impact of a scientist’s future works is predicted first, and the HSPM then captures any hot streak the scientist may have in the future. Two methods were used to construct the HSPM: the Q-model, a classic SOS method based on the evolution of individual careers, and a recent micro prediction method based on a Recurrent Neural Network (RNN) that predicts scientists’ future individual careers. Comparative experiments showed that the RNN-based micro prediction method is more effective than the Q-model.

2) We discussed the potential relationship between the collaboration patterns of research teams and the hot streak. First, team entropy was proposed to quantify the team research behavior of scientists. Team entropy was found to be systematically higher before the start of the hot streak than during it, indicating that scientists tend to conduct research in a single fixed team during the hot streak. By observing the changes in team size, team freshness, and scholar credit before and after the onset of the hot streak, we found that team size during the hot streak was significantly larger than before. Scientists tend to connect with new collaborators during the hot streak to expand the team, and when selecting new collaborators they tend to choose more established scientists in the current research field. These findings confirm the different positioning of small and large research teams in innovation and the "exploration-exploitation" behavior sequence that occurs before and after the onset of the hot streak, providing a theoretical basis for proposing reasonable career development suggestions for scientists.

3) Based on the HSPM and the potential relationship between collaboration patterns and hot streaks, we implemented an integrated career analysis system for research scientists. The system comprises separate modules for locating a scientist’s career period, proposing career development suggestions, and integrated data preprocessing. Currently, the system covers 236,884 scholars who, after name disambiguation, have been matched with their 425,116 publications in journals of the American Physical Society (APS), including 4,174,320 citations. Beyond providing basic information on publications and scientists, the main function of the system is to offer reasonable career development suggestions by analyzing the career history data of registered users.
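
The abstract does not give the formula for team entropy, so the following is only a minimal sketch under one plausible assumption: that team entropy is the Shannon entropy of the distribution of distinct coauthor sets across a window of a scientist's papers. The function name team_entropy and the toy author lists are hypothetical and are not taken from the thesis. Under this reading, a low value means most papers in the window come from the same fixed team.

```python
import math
from collections import Counter


def team_entropy(papers):
    """Shannon entropy (in bits) of the coauthor-team distribution.

    ASSUMPTION: this definition is illustrative; the thesis abstract does
    not state its exact formula. `papers` is a list of author-name lists,
    one per paper in the window being analysed.
    """
    if not papers:
        return 0.0
    # Treat each distinct set of authors as one "team".
    counts = Counter(frozenset(authors) for authors in papers)
    total = len(papers)
    # H = sum_i p_i * log2(1 / p_i), with p_i = count_i / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical windows: varied teams before a hot streak, one fixed team during it.
before = [["A", "B"], ["A", "C"], ["A", "D"], ["A", "B"]]
during = [["A", "B", "E"]] * 4

print(team_entropy(before))
print(team_entropy(during))
```

With these toy inputs the window before the hot streak has higher entropy than the window during it, which is the direction of the effect the abstract reports. Real data would require the name disambiguation described in the abstract before author sets can be compared.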
Keywords/Search Tags: Hot Streak of Scientists, Hot Streak Model, Hot Streak Prediction Model, Collaboration Pattern, Career Development Suggestions