The Research On Scholars' Interest Tags Discovery Based On Academic Networks

Posted on: 2021-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: L Gao
GTID: 2370330629488938
Subject: Engineering
Abstract/Summary:
User profiling is the process of tagging user information to build a concrete picture of a user, and it has been widely used in smart marketing, computational advertising, personalized recommendation, and other fields. Users' interest tags are one of the basic components of user profiling: they describe users' interest preferences and capture how those preferences change. The rapid development of academic research has generated academic big data, and these data make it possible to construct profiles of scholars' research interests. Previous studies have mainly extracted scholars' interest tags from academic text data; far fewer have used academic networks to discover scholars' research interests.

Therefore, this paper formulates the discovery of scholars' research interests as a multi-label classification problem, provided that the interest tag space is known. We construct a large-scale undirected collaboration network, a directed collaboration network, and a citation network, each with scholars as nodes. We then extract feature representations of the scholar nodes from the three academic networks, using different network representation learning methods chosen according to the scale and structure of each network, and design and implement a multi-label classification model to tag scholars whose research interest tags are unknown. The main work of this paper includes the following three aspects:

(1) Based on the “Open Academic Data Challenge 2017” datasets provided by Biendata and top-field tag data for Computer Science crawled from Microsoft Academic, we construct collaboration networks and a citation network with millions of nodes and tens of millions of edges. To identify the core scholars in the undirected collaboration network more accurately, we also construct a directed collaboration network centered on the first author, which supplements the node affiliation information missing from the undirected collaboration network and better describes the research interests of core scholars.

(2) Under the GraphVite framework, we implement representation learning models for the large-scale undirected collaboration network, directed collaboration network, and citation network, and extract feature vectors of the scholar nodes as input features for the multi-label classification models.

(3) We construct a C2AE-based multi-label classification model for scholars' research interests and train and test it on the scholar-node features extracted from the three types of academic networks. To characterize scholars' research interests more accurately, we propose an improved tag fusion method based on weighted voting, which fuses the tag predictions obtained on the test datasets from the three types of academic networks.

The experimental results show that the Micro-F1 of the fused tags is 3.78%, 10.7%, and 0.28% higher than that of the interest tags predicted from the undirected collaboration network, directed collaboration network, and citation network, respectively, while the Hamming loss is reduced by 0.68%, 1.9%, and 0.06%, respectively. In addition, this paper compares the multi-label classification model based on the C2AE algorithm with models based on the MLKNN and BPMLL algorithms. The C2AE-based tag fusion result is 4.39% and 5.88% higher than MLKNN and BPMLL, respectively, on Micro-F1; its Hamming loss is 0.78% lower than that of MLKNN but 0.65% higher than that of BPMLL. Overall, the C2AE model adopted in this paper performs slightly better than MLKNN and BPMLL on the current academic datasets.
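
The fusion and evaluation steps described in the abstract can be illustrated with a minimal sketch. It shows plain weighted voting over binary tag predictions from three networks, followed by Micro-F1 and Hamming loss. The abstract does not specify the thesis's improved fusion rule, so this is not that method. The function names, the integer weights, the 0.5 threshold, and the toy predictions are all assumptions made for illustration.

```python
from typing import Dict, List

Labels = List[List[int]]  # rows = scholars, columns = interest tags (0 or 1)


def fuse_by_weighted_vote(preds: Dict[str, Labels],
                          weights: Dict[str, float],
                          threshold: float = 0.5) -> Labels:
    """Assign a tag when the weighted share of networks voting for it
    reaches the threshold (plain weighted voting, an assumed baseline)."""
    names = list(preds)
    total = sum(weights[n] for n in names)
    n_rows = len(preds[names[0]])
    n_tags = len(preds[names[0]][0])
    fused = []
    for i in range(n_rows):
        row = []
        for j in range(n_tags):
            score = sum(weights[n] * preds[n][i][j] for n in names) / total
            row.append(1 if score >= threshold else 0)
        fused.append(row)
    return fused


def micro_f1(y_true: Labels, y_pred: Labels) -> float:
    """Micro-averaged F1: pool TP, FP, FN over all scholars and tags."""
    tp = fp = fn = 0
    for t_row, p_row in zip(y_true, y_pred):
        for t, p in zip(t_row, p_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of (scholar, tag) positions predicted incorrectly."""
    wrong = sum(t != p
                for t_row, p_row in zip(y_true, y_pred)
                for t, p in zip(t_row, p_row))
    total = sum(len(row) for row in y_true)
    return wrong / total


# Toy data (hypothetical): 3 scholars, 4 candidate interest tags.
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1]]
preds = {
    "undirected": [[1, 0, 0, 0], [0, 1, 0, 1], [1, 1, 0, 1]],
    "directed":   [[1, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 1]],
    "citation":   [[1, 0, 1, 0], [0, 1, 0, 0], [0, 1, 0, 1]],
}
# Hypothetical weights; in practice they might come from validation scores.
weights = {"undirected": 2, "directed": 1, "citation": 2}

fused = fuse_by_weighted_vote(preds, weights)
print("fused tags:", fused)
print("Micro-F1:", micro_f1(y_true, fused))
print("Hamming loss:", hamming_loss(y_true, fused))
```

Higher Micro-F1 and lower Hamming loss both indicate better tagging, which is why the abstract reports gains on the former and reductions on the latter.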
Keywords/Search Tags: Scholar profiling, Academic network, Network representation learning, Multi-label classification, Tag fusion