| Heterogeneous information networks have become a research hotspot because they fuse richer structural information and carry rich semantics in their nodes and links, which benefits data mining tasks including node influence ranking, classification, clustering, relationship prediction, and recommendation. The dual-hub heterogeneous information network has a special structure: it is composed of two subnets that are linked together by hub entities. For such networks, relationship prediction between hub entities is of great significance to personalized recommendation among entities. This paper takes as its main research object a dual-hub heterogeneous information network that combines a social relations subnet and an information subnet, and puts forward a three-stage analysis method covering group node influence, user preference genes, and user-item interest ranking. Using the Douban film network as experimental data, it completes relationship prediction between the hub entities of the social relations subnet and the film information subnet. Firstly, this paper studies the group node influence problem in the information subnet and proposes an attention-introduced random walk model, named AI-RWM. The model analyzes group node influence with a random walk that combines type-driven and topology-driven components. User attention from the social relations subnet is then introduced into the type-driven random walk to associate node influence with user attention. At the same time, the influence of zero-attention nodes is optimized by a random transition probability repair mechanism. A series of comparative experiments demonstrates that node influence analysis in the information network after introducing user attention not only avoids the defect of quantity sensitivity but also integrates user group interest, so the ranking results are more comprehensive. Secondly, this paper studies the user preference gene problem in heterogeneous
information networks, analyzes the similarity between "item" node pairs and "user" node pairs based on meta-paths, and proposes a meta-path-based user preference gene model, named MPATH-GENE. This model takes the user's relevant information subnet as the analysis object. First, the HeteSim algorithm is used to compute the correlation between each "item" node and all of its attribute-type nodes. Then, the correlations are converted into link weights, and all shortest paths between "item" nodes are calculated. Finally, each path is abstracted into a meta-path, and the user's preference gene is calculated by analyzing the meta-path weights. Experiments show that user similarity based on meta-paths is consistent with the extracted preference genes. Finally, this paper studies the relationship prediction problem between the hub entities and proposes a random-walk-based user-item interest ranking model, named UII-RWM. This model combines the group node influence in the information subnet extracted by AI-RWM with the user preference genes extracted by MPATH-GENE, and adopts a two-type random walk for node influence analysis. Meanwhile, the concept of collaborative filtering is introduced into the topology-driven random walk, and a method for filling in item attention based on user similarity is designed. The random walk then produces the user-item interest ranking for a specific user, thereby completing relationship prediction between that user and the items. Multiple groups of comparative experiments indicate that the model predicts these relationships with reasonable accuracy. |
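
The idea of biasing a random walk by user attention, as described for AI-RWM above, can be illustrated with a small sketch. This is not the paper's model: it is a plain personalized random walk computed by power iteration, in which the jump distribution is proportional to user attention. The function name `attention_biased_rank`, the `floor` smoothing weight, the damping value, and the toy graph are all assumptions made for illustration. The uniform floor only stands in for the kind of repair that keeps zero-attention nodes reachable; the abstract does not specify the actual repair mechanism.

```python
def attention_biased_rank(adjacency, attention, damping=0.85,
                          floor=0.1, iterations=100, tol=1e-10):
    """Rank nodes with a random walk whose jump step favors attended nodes.

    adjacency: dict mapping each node to a list of successor nodes
               (every successor must also appear as a key).
    attention: dict mapping nodes to non-negative attention counts.
    floor:     hypothetical smoothing weight; gives zero-attention nodes
               a small uniform share of the jump probability.
    """
    nodes = sorted(adjacency)
    n = len(nodes)
    total = sum(attention.get(v, 0.0) for v in nodes)

    # Jump distribution: attention share blended with a uniform floor.
    if total > 0:
        jump = {v: (1 - floor) * attention.get(v, 0.0) / total + floor / n
                for v in nodes}
    else:
        jump = {v: 1.0 / n for v in nodes}

    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Start each round with the probability mass that jumps.
        nxt = {v: (1 - damping) * jump[v] for v in nodes}
        for u in nodes:
            successors = adjacency[u]
            if successors:
                # Topology step: split mass evenly over outgoing links.
                share = damping * score[u] / len(successors)
                for v in successors:
                    nxt[v] += share
            else:
                # Dangling node: hand its mass back via the jump distribution.
                for v in nodes:
                    nxt[v] += damping * score[u] * jump[v]
        delta = sum(abs(nxt[v] - score[v]) for v in nodes)
        score = nxt
        if delta < tol:
            break
    return score


# Toy example (made-up data): four film nodes, two with zero attention.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
attention = {"A": 5, "B": 1, "C": 0, "D": 0}
scores = attention_biased_rank(graph, attention)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 4))
```

In this sketch, node D has no incoming links and no attention, yet it keeps a small positive score because of the uniform floor. Setting `floor=0` would drive its score to zero, which is the kind of degenerate ranking a repair step is meant to avoid.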