
Research On Link Prediction Methods For Multiple Types Of Information Networks

Posted on: 2021-12-17
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Li
GTID: 1480306338979689
Subject: Computer software and theory
Abstract/Summary:
Entities in the real world are often interconnected, forming information networks. Entities can be items, papers, meetings, authors, pictures, movies, directors, etc., and relationships can be purchasing, publication, watching, acting, directing, etc. Link prediction uses the observed nodes and network structure of an information network to predict links among nodes or the signs of unlabeled links. The predicted links may be links that already exist but have not been observed, or links that might appear in the future. Link prediction has become an important research area in data mining, and it helps users better understand how networks are generated and how they evolve.

Although link prediction has been studied extensively in recent years, existing work remains insufficient because of the heterogeneity, incompleteness and complementarity of information networks. Firstly, existing link prediction techniques treat all nodes and relationships equally or analyze nodes and relationships independently. The types of entities and relationships are often ignored, and multiple features are not fused effectively. Secondly, most existing methods for link prediction in signed networks rely only on features from labeled ties. In incomplete signed networks, however, little information about labeled ties is available, and there is often not enough sample data to train machine learning models. Thirdly, existing methods rely mainly on features from social links and ignore users' behavior. Moreover, traditional work defines features that capture only the dependencies between a single predicted result and the evidence within one task; the interaction among multiple tasks is often ignored. Therefore, for four types of information networks (heterogeneous information networks, incomplete signed networks, social recommendation networks and multiple information networks), we present corresponding link prediction models and algorithms based on graph theory, information theory, machine learning, etc. We make the following contributions:

(1) For heterogeneous information networks, we present a link prediction method based on a Hierarchical Hybrid Feature Graph (HHFG). Firstly, we present the HHFG model. Unlike traditional link prediction techniques, the HHFG model distinguishes different entities and relationships by capturing both entity features and tie features. The model considers structure features, semantic features and time features and represents them hierarchically, so that the abundant features of information networks are organized effectively. Secondly, we propose an HHFG-based link prediction algorithm. On the one hand, it performs a random walk on the HHFG based on the hybrid features, and we estimate the likelihood of a link forming between nodes from the random walk probability. On the other hand, parameters such as feature weights and transition coefficients are learned by gradient descent to improve the accuracy of link prediction. Finally, experiments comparing our HHFG-based method with other link prediction methods demonstrate its feasibility and effectiveness.

(2) For incomplete signed networks, we present an Unlabeled Ties based Link Prediction (UTLP) method to predict the signs of unlabeled ties. Firstly, unlike traditional link prediction methods, the UTLP model draws on social awareness theories to use the features of both labeled ties and unlabeled ties. Secondly, we adopt a transfer learning algorithm with instance weighting, which uses the more useful training instances from a source network to train the model and predict the signs of links. This compensates for the scarcity of labeled samples in the target network. Finally, we conduct experimental studies on Epinions, Slashdot and Wiki-RfA. The experiments demonstrate the effectiveness and efficiency of the proposed method compared with traditional methods.

(3) For social recommendation networks, we present a model that integrates Sign Prediction with Behavior Prediction (SPBP). Firstly, we adopt a deep learning-based embedding technique to extract users' representations. We then use these representations and users' behavior to estimate the social correlation and the behavioral correlation between users, respectively. Secondly, we propose an iterative prediction algorithm that integrates sign prediction with behavior prediction. It improves prediction accuracy by exploiting the interaction between the two tasks. For sign prediction, we use both labeled links and unlabeled links to extract features based on social psychology theories, and we propose a sign prediction algorithm based on transfer learning. For behavior prediction, we take into account the social features of users and distinguish between positive links and negative links, which provides more informative evidence for behavior prediction. Finally, extensive experiments on real-world social recommendation networks (Epinions, Slashdot and Wiki-RfA) demonstrate that SPBP can effectively solve both the sign prediction problem and the behavior prediction problem.

(4) For multiple information networks, we present an Information Networks Fusion Model based on Multi-task Coordination (MC-INFM). Unlike traditional models, MC-INFM casts the fusion problem as a probabilistic inference problem and collectively performs multiple tasks (entity resolution, link prediction and relation matching) to infer the final fusion result. Firstly, we define intra-features and inter-features and model them as factor graphs. Secondly, we use a Conditional Random Field (CRF) to learn the weights of the intra-features and inter-features. Thirdly, we propose an iterative inference algorithm based on multi-task coordination, which infers the results of these tasks simultaneously by performing maximum probabilistic inference. Finally, experiments demonstrate the effectiveness of the proposed MC-INFM model.
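The random-walk scoring idea mentioned in contribution (1) can be illustrated with a small sketch. This is not the HHFG algorithm from the dissertation: it is a plain random walk with restart on a toy graph, the edge weights are fixed placeholders rather than feature weights learned by gradient descent, and the names `random_walk_scores`, `rank_candidates`, `restart` and the example graph are all hypothetical.

```python
def random_walk_scores(adj, source, restart=0.15, iters=100):
    """Random walk with restart from `source`.

    adj maps each node to a dict {neighbor: nonnegative weight}.
    Weights are placeholders here; a feature-based model would derive them
    from learned feature weights instead (assumption, not shown).
    """
    nodes = sorted(set(adj) | {v for nbrs in adj.values() for v in nbrs})
    prob = {n: 0.0 for n in nodes}
    prob[source] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            mass = prob[u]
            if mass == 0.0:
                continue
            nbrs = adj.get(u, {})
            total = sum(nbrs.values())
            if total == 0:
                # Node with no outgoing weight: return all of its mass to the source.
                nxt[source] += mass
                continue
            # With probability `restart`, jump back to the source.
            nxt[source] += restart * mass
            # Otherwise follow an outgoing edge in proportion to its weight.
            for v, w in nbrs.items():
                nxt[v] += (1.0 - restart) * mass * w / total
        prob = nxt
    return prob


def rank_candidates(adj, source, **kwargs):
    """Rank nodes not yet linked to `source` by their walk probability."""
    scores = random_walk_scores(adj, source, **kwargs)
    known = set(adj.get(source, {})) | {source}
    candidates = [(n, s) for n, s in scores.items() if n not in known]
    return sorted(candidates, key=lambda item: -item[1])


# Toy undirected graph with uniform weights (illustrative data only).
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
adj = {}
for u, v in edges:
    adj.setdefault(u, {})[v] = 1.0
    adj.setdefault(v, {})[u] = 1.0

ranked = rank_candidates(adj, "a")
for node, score in ranked:
    print(node, round(score, 4))
```

In this toy graph, node `d` is reachable from `a` through two short paths while `e` is only reachable through `d`, so `d` ranks above `e` as a candidate link. Making the edge weights a differentiable function of node and tie features is where a learned model such as the one described above would differ from this sketch.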
Keywords/Search Tags: link prediction, heterogeneous information network, incomplete signed network, social recommendation network, multiple information networks