Font Size: a A A

Research On Key Techniques Of Reachability Query Of Graph Data

Posted on:2015-03-05Degree:MasterType:Thesis
Country:ChinaCandidate:P XueFull Text:PDF
GTID:2180330473953624Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Reachability query is one of the main research directions on graph data, and receives more attention from current researchers. With coming of big data and the expansion of the scale of data, more and more complex structured data are represented as graph data. Reachability query refers to the ability to get from one vertex to another within a graph. It has been researched for over 20 years, and is an important topic in the field of database. So far, reachability query has been applied in many other fields. In social network, every person can be regarded as a vertex and reachability query can help us find relationship between two persons. In protein networks, vertices are represented as molecules and edges as reactions. Reachability reflects reaction characteristics among molecules. In addition, citation network and graph pattern matching also use the function of reachability query.How to improve the performance has always been a crucial work for the reachability query, and a lot of researches process around this. With the further researching, new types of reachability query are proposed, such as label-constraint query. In this thesis, with the features of current graph data, we study the optimizing approaches for different definitions of reachability queries over graph data. Firstly, with combining the concept of interval labeling and path dividing, we propose a path interval labeling approach. The previous multiple interval approach has to iterate multiple times, especially when two vertices have a reachability relationship. This equivalents to traverse the graph by DFS, so it’s required to be optimized. Therefore, we add path information in the index, which can reduce iteration times and query time. Then, based on different paths, we propose the ESQ query strategy, in which different paths connect each other by equivalent edge, and we record the information of equivalent edge in index. Comparing transitive closure, it can reduce the index space and achieve better performance on query processing time. Secondly, for the reachability query with constraint, existing works only focus on the weight constraint of indirect graph. Therefore, we define the concept of the weight constraint on directed graph and address the problem of reachability query on weighted graph. Because of the additional weight of directed graph, we create the index for the directed graph according to transitive closure, and propose an optimized algorithm for the transitive closure. Finally, based on the approaches designed for the acyclic graph, we propose the approach that transforms the directed graph into weighted directed acyclic graph. Our approach uses reversing topological order to compute transitive closure, which reduces repeated search on edges and can create transitive closure with less time.Some necessary proofs of proposed theories and detailed illustrations of related algorithms are shown in this thesis. On the basis, we verify performance of our algorithms on both real and simulated datasets, and results show that our approaches can improve execution efficiency of reachability query.
Keywords/Search Tags:directed graph, reachability query, path dividing, weight constraint
PDF Full Text Request
Related items