
Research On Journal And Paper Network Based On Graph Auto Encoder

Posted on: 2022-06-18
Degree: Master
Type: Thesis
Country: China
Candidate: N Wang
GTID: 2480306332965449
Subject: Software engineering
Abstract/Summary:
Internet technology has made academic papers easier to disseminate and access, but it has also brought information overload, making it difficult for researchers to find the information they need in the massive body of journal papers. Literature mining and analysis with the help of artificial intelligence has therefore gradually become a research hotspot. Existing paper and journal recommendation methods mostly use features such as the paper abstract and the journal name, and often ignore the rich relationships among papers, journals, and authors, which leads to unsatisfactory results. To address this, this thesis abstracts the relationships among papers, journals, and authors into a heterogeneous network, the journal and paper network, and then applies network analysis methods to extract more information.

In recent years, researchers have proposed many network analysis methods, among which graph embedding has attracted considerable attention. Its main idea is to map each node in the network to a vector in an embedding space while preserving the structural relationships between nodes. Graph embedding techniques fall into three categories: methods based on matrix factorization, methods based on random walks, and methods based on deep learning. Early methods were mostly designed for homogeneous graphs, which contain only one type of node. Heterogeneous information networks such as the journal and paper network, however, contain more complex structural and semantic information, such as node types and connection types, so methods designed for homogeneous graphs cannot be applied directly. For the journal and paper network in particular, matrix factorization methods struggle with the large number of nodes, random walk methods have difficulty capturing the interactions between different types of nodes, and many deep learning methods ignore node type information.

Based on these observations, this thesis proposes a node representation model for the journal and paper network to better support multiple downstream tasks such as journal recommendation. The work is divided into two parts:

1. Construction of HGAE, a node representation model for the journal and paper network based on the graph autoencoder. First, a subset of the papers in PubMed was selected and extracted, and a biomedical journal and paper network was constructed from it. Second, for this network, a node representation model HGAE based on a heterogeneous graph autoencoder is proposed. HGAE decomposes the network into different sub-graphs and encodes each one separately, then integrates the sub-graph features and decodes them, making full use of the different semantic information in the network to obtain node embeddings for the journal and paper network. Finally, HGAE is trained on the public AMiner dataset and on the PubMed dataset to obtain node embeddings for both networks. Visualization shows that the embeddings of authors, venues, and papers all cluster well.

2. Application of the node representations obtained by HGAE to multiple downstream tasks. To further study the applicability of the node representations obtained by HGAE, they are applied to several downstream tasks, namely co-author prediction, paper citation prediction, journal recommendation, and author node classification, and compared with five network representation methods, including GAT and GraphSAGE. The experiments show that HGAE performs better. On the two prediction tasks, co-author prediction and citation prediction, the AUC of HGAE on the AMiner2012 dataset is 2.2% to 5.1% higher than that of HetGNN. On the journal recommendation task, its F1 score is 10.1% higher than that of HetGNN. On the classification task, its accuracy is above 0.9.

In addition, this thesis analyzes how the sub-graph combination method and the node embedding length affect the results. The experiments show that the sub-graph combination method gives more stable performance. They also show that as the embedding dimension increases, prediction performance improves at first, but beyond a certain point a larger embedding dimension degrades performance.

In summary, the main work of this thesis is the construction of HGAE, a node representation model based on the graph autoencoder for heterogeneous journal and paper networks, and the demonstration of its good performance on multiple downstream tasks: co-author prediction, citation prediction, journal recommendation, and author node classification.
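
The decompose, encode, integrate, and decode pipeline described in the abstract can be illustrated with a minimal, untrained forward pass. This is only a sketch under stated assumptions and not the HGAE implementation from the thesis: the single GCN-style layer, the mean used to combine sub-graph embeddings, the inner-product decoder, the toy node layout, and all function names are hypothetical choices made for illustration.

```python
import numpy as np

def build_adjacency(num_nodes, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    adj = np.zeros((num_nodes, num_nodes))
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0
    return adj

def normalize_adjacency(adj):
    """GCN-style normalization: D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def encode_subgraph(adj, features, weight):
    """One GCN-style layer per sub-graph (assumed encoder, not the thesis's)."""
    return np.tanh(normalize_adjacency(adj) @ features @ weight)

def forward(subgraph_edges, features, weights):
    """Encode each sub-graph separately, fuse by mean, decode by inner product."""
    n = features.shape[0]
    per_subgraph = [
        encode_subgraph(build_adjacency(n, edges), features, w)
        for edges, w in zip(subgraph_edges, weights)
    ]
    fused = np.mean(per_subgraph, axis=0)       # assumed combination method
    logits = fused @ fused.T                    # inner-product decoder
    recon = 1.0 / (1.0 + np.exp(-logits))       # predicted link probabilities
    return fused, recon

# Hypothetical toy network: nodes 0-1 are authors, 2-4 papers, 5 a journal.
num_nodes, dim = 6, 4
subgraph_edges = [
    [(0, 2), (0, 3), (1, 3), (1, 4)],  # author-paper sub-graph
    [(2, 5), (3, 5), (4, 5)],          # paper-journal sub-graph
    [(2, 3), (3, 4)],                  # paper-paper (citation) sub-graph
]
rng = np.random.default_rng(0)
features = np.eye(num_nodes)           # one-hot node features
weights = [rng.normal(scale=0.5, size=(num_nodes, dim)) for _ in subgraph_edges]

fused, recon = forward(subgraph_edges, features, weights)
print(fused.shape, recon.shape)
```

In a trained model, the weights would be fitted by minimizing a reconstruction loss between `recon` and the observed links; that step, and the way the thesis handles node and edge types, are omitted here.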
Keywords/Search Tags: Graph Auto Encoder, Journal and Paper Network, GCN, Graph Embedding