| Graph is a general data structure which is used to describe the complex relationship between the entities.Heterogeneous graph is a kind of special structure with label,which is widely used in a variety of complex data modeling composed of different entities,such as information network,biological network application field and so on.How to carry out effective subgraph matching in the heterogeneous data map has become a research hotspot in recent years.The query graph is be given by users,and what we need is to find all subgraph structure from big data diagram which the nodes,the labels and the structure are same as the query graph.However,in practical people tend to focus on the matching results in high relatively degree,and thus it leads to the Top-K interest subgraph matching problem which is more targeted.It is mainly two parts,one is to find all matching subgraphs in the graph based on the query graph;the other is to get the first K graph from all matching graph according to the degree of interest by ranking K.In this paper,we study the Top-k subgraph matching problem in heterogeneous graphs.At present,there are some problems in the existing methods,For example,the subgraph matching problem of none-weighted graph on the static graph has not considered the user’s personalized needs,namely,ignored the processing of the weighted graph.In reality,the data graph may change because of time lapse or the semantic change,namely dynamic evolution.In the present study,little attention has been paid to the dynamic Top-k subgraph matching.Therefore,in order to solve the above problems,this paper consider the individual needs of users and puts forward a kind of effective Top-K interest subgraph matching method of the weighted query graph on static graph.At the same time,this paper proposes an incremental dynamic Top-k interest subgraph matching method based on the above research.The main contents are as follows:(1)This paper proposes an efficient matching method of static Top-K in subgraph(ERWM).Firstly,according to the characteristics of heterogeneous graph,it proposes a kind of storage structure similar to adjacency list graph which is to storage heterogeneous data.Secondly,it establishes two novel indexes: topological characteristics index NTFI node,edge feature index(EFI).The two indexes are designed to make full use of the information of nodes and edges to filter out unqualified in the query(mismatch)nodes and edges,and to get the smaller number of candidate sets.It avoids the unnecessary nodes in the matching phase and the edge connectivity is detected,thus reducing the redundant matching steps of the algorithm.Finally,it puts forward the query graph edge label setting method,and determines the sequence of subgraph matching verification according to the edges of query graph.At the same time,this paper introduces the idea of edge matching and edge sorting of RWM algorithm,improving the subgraph matching efficiency and reduce the redundant subgraph matching step.(2)This paper proposes an effective matching method of Top-K dynamic interest subgraph(DRWM).The method is the dynamic expansion based on the ERWM algorithm.First,it uses local dynamic sliding window model to produce the local data of dynamic heterogeneous graph.Secondly,it implements the Top-k interest subgraph matching based on incremental dynamic updating of Graphs with the idea of sliding window.(3)In the simulation data and real data,this paper is compared with the existing RAM methods and RWM methods on simulated data and real data.Then the experimental results are compared and analyzed,respectively,for the algorithm index construction and Top-k subgraph matching.This paper verified the Top-K interest subgraph matching method has good matching ability in the simulation and real network,which make the Top-K interest in subgraph matching greatly improved. |