Research On Network Representation Learning Based On Heterogeneous Information Fusion

Posted on: 2019-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Liu
GTID: 2428330596959458
Subject: Information and Communication Engineering
Abstract/Summary:
Many complex real-world systems can be modeled as networks for analysis. However, with the advent of large-scale social networks, traditional network representation methods based on network topology suffer from computational inefficiency and have difficulty integrating heterogeneous information. To address this, researchers have turned to Network Representation Learning (NRL), which uses machine learning to learn a low-dimensional dense vector for each node in the network. The vectors are intended to retain the structural information and other heterogeneous information of the nodes in the original network, and they can serve as feature vectors for downstream network analysis tasks such as node classification and link prediction.

This thesis studies network representation learning methods that fuse different kinds of heterogeneous information in different scenarios, with the goal of improving performance on the associated tasks. The heterogeneous information considered includes node textual content information, node multi-dimensional categorical information, and edge signed semantic information. Although fusion representation learning methods for these three types of information have made progress in recent years, the following shortcomings remain:

(1) Representation learning methods that incorporate node textual content information attend only to the constraint that textual content imposes on network structure information and ignore the reverse constraint that network structure imposes on textual content. They are also limited in their ability to mine the core semantics of complex, multi-topic node text.

(2) Network representation learning methods that incorporate node multi-dimensional categorical information treat that information only as a prior feature that assists the learning of node structure representations. They lack a mechanism for coping with missing data and have low robustness when the categorical information is incomplete.

(3) Network representation learning methods that incorporate edge signed semantic information either model only a limited set of contextual link types or model only the aggregated semantics of different contextual links. They lack fine-grained modeling of the individual contextual links and handle complex edge signed semantic information poorly.

This thesis addresses these issues. The specific work is as follows:

1. To address the insufficient ability of existing fusion methods to mine the semantics of complex, multi-topic node text, this thesis proposes a co-coupled representation learning model based on parameter sharing. On the one hand, the model captures the constraint that network structure information imposes on textual content information and mines the core semantic information of the text. On the other hand, a cross-iterative training strategy lets the two mutual constraints compete dynamically during representation learning, yielding network representations better suited to the data scenario. Experimental results show that the proposed method effectively models the mutual constraint relationship and improves performance on node classification.

2. To address the low robustness of existing fusion methods when node multi-dimensional categorical information is incomplete, this thesis proposes a representation learning model based on stochastic perturbation and a homophily constraint. On the one hand, the data set is transformed by a stochastic perturbation strategy to improve the model's adaptability to incomplete information. On the other hand, an attribute similarity preserving method based on the homophily principle is designed to mine the useful semantic information in incomplete data while the fused representation vectors are learned. Experimental results show that the proposed method copes effectively with incomplete information and improves performance on node classification and link prediction.

3. To address the weakness of existing fusion methods in handling complex edge signed semantic information, this thesis proposes a representation learning model based on predicting contextual links between nodes. On the one hand, a neural-network-based binary classifier is designed for relationship prediction; it models different types of contextual links and thereby explores complex semantic relationships between nodes. On the other hand, a random-walk-based method for sampling contextual link relationships is designed to meet the training requirements of large-scale networks. Experimental results show that the method effectively mines complex semantic relations between nodes and improves performance on link sign prediction.
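
The random-walk-based sampling of contextual links mentioned in item 3 can be illustrated with a minimal sketch. The abstract does not specify the thesis's actual sampling procedure, so everything below is an assumption made for illustration: the function names `random_walk` and `sample_context_pairs`, the parameters `walk_length`, `walks_per_node` and `window`, and the choice to describe a contextual link by the sequence of edge signs along the walk between two nodes.

```python
import random
from collections import defaultdict


def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk


def sample_context_pairs(edges, walk_length=5, walks_per_node=2, window=2, seed=0):
    """Sample (u, v, path_signs) triples from a signed, undirected graph.

    `edges` is an iterable of (u, v, sign) with sign in {+1, -1}.
    `path_signs` is the tuple of edge signs on the walk from u to v.
    This is a hypothetical encoding of a "contextual link", not the
    thesis's definition.
    """
    rng = random.Random(seed)
    adj = defaultdict(list)
    sign = {}
    for u, v, s in edges:
        adj[u].append(v)
        adj[v].append(u)
        sign[(u, v)] = s
        sign[(v, u)] = s

    pairs = []
    for node in sorted(adj):
        for _ in range(walks_per_node):
            walk = random_walk(adj, node, walk_length, rng)
            for i, u in enumerate(walk):
                # Pair u with every node at most `window` steps ahead.
                last = min(i + window, len(walk) - 1)
                for j in range(i + 1, last + 1):
                    v = walk[j]
                    if u == v:
                        continue  # the walk returned to u; skip self-pairs
                    path_signs = tuple(
                        sign[(walk[k], walk[k + 1])] for k in range(i, j)
                    )
                    pairs.append((u, v, path_signs))
    return pairs


# Toy signed graph: a path 0 - 1 - 2 - 3 with one negative edge.
toy_edges = [(0, 1, +1), (1, 2, -1), (2, 3, +1)]
toy_pairs = sample_context_pairs(toy_edges)
```

Triples sampled this way could serve as training examples for a binary classifier over node pairs, with one classifier (or one output) per sign pattern. Whether the thesis groups contextual links in this way is not stated in the abstract.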
Keywords/Search Tags: Network Representation Learning, Heterogeneous Information Fusion, Cross Training, Stochastic Perturbation, Contextual Link