Research On Reachability Indexing For Very Large Graphs

Posted on:2016-12-04

Degree:Master

Type:Thesis

Country:China

Candidate:Z Li

Full Text:PDF

GTID:2310330503977888

Subject:Computer Science and Technology

Abstract/Summary:

PDF Full Text Request

Graph which is composed of vertices and edges can be used to depict objects and their relations. Graph has been widely used as data model for many domains, such as Semantic Web, bioinformatics, ontology, and so on. Applications of these areas need to deal with scalability issues on mega-scale data. One of the fundamental graph operations is reachability query. Reachability query asks whether there exists a path between two vertices. Current research focuses on very large graph data with the advent of new applications. Existing methods use index to compress transitive closure and trade-off indexing time, space and query time. In this thesis, we first give vertices a level number after topology sort and then according to the properties of the level number in number theory, we select part of ancestors and descendants of each vertex to construct 2-Hop label. The main contributions are:(1) Give the formal definitions and theoretical basis for the label. We introduce the concept of order on vertices and choose part of order-increasing ancestors and descendants of each vertex to construct 2-Hop label. The correctness of the label is ensured by level number's properties in number theory. In addition, according to the properties in number theory, we put forward an effective method to deal with cross-level edges by adding no more than one dummy vertex for each cross-level edge to maintain the reachability between ancestors and descendants.(2) Design index construction algorithm. By introducing the concept of increasing sequence and decreasing sequence and using their properties in number theory, we propose an efficient index construction algorithm to avoid computing the transitive closure.(3) Optimize the label. We delete dummy vertices to reduce the size of the label and delete high degree vertices based on dummy edges to reduce the size of the label in the construction process.(4) Experiment on large real graphs and large synthetic graphs. We evaluate all methods in terms of the indexing time, space and query time. First we analysis the effect of the optimization. The optimization of deleting dummy vertices will bring some cost of computing, but it can effectively reduce the size of the label. The choice of K in deleting top K high degree vertices based on dummy edges has little effect on performance. We also compare our algorithm with the best existing methods and our algorithm has some advantages in indexing time and query time. Finally, we experiment on large synthetic graphs to assess the scalability.

Keywords/Search Tags:

Reachability Query, Graph, Index Algorithm, 2-Hop Labeling

PDF Full Text Request

Related items

1	Large Graph Reachability Query Research On Distributed System
2	Reachability Query Based On Weight Constraint In Large Graph
3	Research Of Cost C Reachability Query On Graph Data
4	Research On K-hop Reachability Query Algorithm On Large Graph
5	Research On Reachability Query And Optimization Method For Large-Scale Directed Acyclic Graphs
6	Research On Key Techniques Of Reachability Query Of Graph Data
7	Research On K-step Reachability Query Algorithm Based On Weight Constranit
8	Research On K-step Reachability Query Processing On Label Constrained Graphs
9	Studies On Query Processing Techniques Over Uncertain Graph Data
10	Research On Reachability Query Algorithm Based On Interval Extension