
Graph Kernel Based Link Prediction In Heterogeneous Information Networks

Posted on: 2022-06-08; Degree: Master; Type: Thesis
Country: China; Candidate: Y Zhao; Full Text: PDF
GTID: 2480306476483154; Subject: Computer Science and Technology
Abstract/Summary:
Heterogeneous information networks contain rich structural and semantic information and can intuitively and flexibly distinguish the different types of objects and relationships in interactive systems. Link prediction is a basic problem in graph mining. It estimates the probability that a link exists between two nodes based on the information already available in the observed network, and it underlies many data mining tasks. Most existing methods focus only on network topological features and ignore node attribute information, and the similarity measures used for link prediction often consider only the meta-path-based path similarity between nodes. This thesis extracts node attribute information according to the characteristics of the data and combines it with graph kernel theory to predict links in heterogeneous information networks, analyzing the problem at two levels: node attributes and network topological features. The main work of the thesis is as follows:

(1) Subgraph generation and node attribute vectorization. Meta-paths are searched to filter useful information about the objects in the network, and an automatic meta-path generation method is constructed using depth-first traversal. Based on the generated meta-path set, the heterogeneous information network is pruned around the target node pair, yielding a subgraph composed of the path instances that pass through this node pair. For the node attribute information, feature words are selected using the TF-IDF method and a similarity measure, and the GloVe model is used to generate word vectors for these feature words, giving a vector representation of the attributes of the target node pair. The generated subgraphs and vectorized node attributes provide the basis for the subsequent work.

(2) A link prediction method based on graph kernels. Similarity features of the subgraphs are extracted, an SVM learns these features, and the likelihood that a link exists is predicted. Graph kernel theory is used to measure the similarity between subgraphs: the graph kernel maps graphs into a Hilbert space and computes the similarity between two graphs in that space. The SVM is then trained on the feature vectors obtained by the graph kernel method and serves as the link prediction model, producing the link prediction classification result. Considering network topology and node attribute information together yields more comprehensive features and improves prediction accuracy.

(3) Experimental verification and analysis. On four sub-datasets of the AMiner dataset, NGLP, the method proposed in this thesis, is compared with three supervised learning models and four score-based models. The evaluation results of the proposed method improve on the different sub-datasets. The experimental results show that the algorithm improves accuracy while maintaining a certain degree of stability.
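The idea in (2) of comparing two subgraphs through a graph kernel can be illustrated with a minimal sketch. The abstract does not say which graph kernel NGLP uses, so the Weisfeiler-Lehman subtree kernel is assumed here purely as a stand-in. The function names, the node type labels "A", "P" and "V" (author, paper, venue), and the toy subgraphs are hypothetical, and node attributes are left out.

```python
from collections import Counter
from math import sqrt


def wl_features(adjacency, node_types, iterations=2):
    """Count Weisfeiler-Lehman subtree labels for one typed graph.

    adjacency:  dict mapping node -> list of neighbouring nodes (undirected)
    node_types: dict mapping node -> type label such as "A", "P" or "V"
    """
    labels = dict(node_types)
    counts = Counter(labels.values())  # iteration 0: plain node types
    for _ in range(iterations):
        refined = {}
        for node, neighbours in adjacency.items():
            # New label = own label plus the sorted labels of the neighbours.
            around = ",".join(sorted(labels[n] for n in neighbours))
            refined[node] = f"{labels[node]}({around})"
        labels = refined
        counts.update(labels.values())
    return counts


def wl_kernel(graph_a, graph_b, iterations=2):
    """Inner product of the two label-count vectors (the implicit feature map)."""
    fa = wl_features(*graph_a, iterations=iterations)
    fb = wl_features(*graph_b, iterations=iterations)
    return sum(fa[label] * fb[label] for label in fa.keys() & fb.keys())


def normalized_kernel(graph_a, graph_b, iterations=2):
    """Cosine-style normalisation so that a graph compared with itself gives 1."""
    kab = wl_kernel(graph_a, graph_b, iterations)
    kaa = wl_kernel(graph_a, graph_a, iterations)
    kbb = wl_kernel(graph_b, graph_b, iterations)
    return kab / sqrt(kaa * kbb)


# Hypothetical subgraphs around an author pair: one A-P-A path instance
# and one A-P-V-P-A path instance.
apa = (
    {"a1": ["p1"], "a2": ["p1"], "p1": ["a1", "a2"]},
    {"a1": "A", "a2": "A", "p1": "P"},
)
apvpa = (
    {"a1": ["p1"], "p1": ["a1", "v1"], "v1": ["p1", "p2"],
     "p2": ["v1", "a2"], "a2": ["p2"]},
    {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"},
)

print(normalized_kernel(apa, apa))
print(normalized_kernel(apa, apvpa))
```

A matrix of such kernel values over all training subgraphs could then be passed to an SVM that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`. Whether the thesis trains its SVM this way is not stated in the abstract.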
Keywords/Search Tags: Heterogeneous Information Network, Link Prediction, Meta-path, Graph Kernel