| As an important data structure,graphs are widely used in practical problems in many fields such as chemical research,social networking,and biology.As a research hotspot of graph data management,subgraph matching has attracted more attention.As the amount of data increases exponentially,how to complete the graph matching process acceptable to computer memory in a limited time is a challenge for researchers.The existing subgraph matching algorithms usually adopt filtering and validation strategies,which are suitable for small-scale datasets.The increase in the scale of graph data and the redundant calculation problems in the matching process make the matching time complexity of the existing algorithms on large-scale graph data sets relatively high.In response to the above problems,this topic has carried out the following research:Firstly,aiming at the problem of large scale of data graph,a graph compression algorithm based on data graph is proposed.According to the label and topological structure of the node,the local influence value of the node is estimated,utilize the local influence of the node to find the judgment order of the equivalent node,by comparing node labels and neighbor information,we can judge whether they are equivalent nodes,and then classify them,the nodes that belong to the same equivalence class are compressed into a super node,finally,the data graph is compressed into a smaller hypergraph,which is conducive to the storage and mining of large-scale graph data.Secondly,in order to improve the efficiency of subgraph matching algorithm,an index construction method based on node coding is proposed to create the index of hypergraph.The index structure is pruned by using the filtering strategies such as the size of node degree,attribute label,neighborhood label filter and node inclusion relation to obtain the node candidate set.Thirdly,according to the node degree and its label of the query graph,the method of setting the node information is proposed,and the order of matching verification is obtained.The matching result is expanded by using the equivalence relation of nodes and combination strategy,and the whole matching process of subgraph is completed.Finally,the above algorithm and the existing algorithm are compared and tested on different real data sets to verify the time efficiency of the proposed algorithm. |