| Recently, the graph model has gained increasing attention. As a basic query on graphs, reachability queries have been extensively studied. It is also a fundamental component of subgraph isomorphic queries and is significantly supported in that reachability query may be frequently invoked to process large amounts of data and complex queries. Motivated by the huge amounts of graph data such as social networks, biological networks, and the semantic web, many of these real-world graphs are edge-labeled and display a property of uncertainty due to statistical errors and other reasons. A fundamental research problem concerning these labeled uncertain graphs is called the label constraint reachability query (LCR), e.i., given two vertices u and v, query the probability that u can reach v through paths whose edge labels are constrained by a set of labels. For uncertain labeled graphs, traditional reachability technology is no longer applicable.This thesis first introduces and defines the label and distance constraint reachability query and then investigates a subpath-based exact algorithm which combines divide-conquer algorithm and a branch path filter strategy to maximally compress the original graph and fast compute the reachability. To efficiently and accurately approximate the problem, several estimators are proposed based on the unequal probabilistic sampling framework, DC-Tree and Random Walk sampling methods. Next, this thesis introduces another problem-label order constraint reachability query which asks for paths satisfying a given label expression. This thesis proposes an iterative algorithm that utilizes the path index and effective prunings to locate edges containing the sequential labels. Then it constructs subpaths among those edges and connects them to form full paths. The Monte Carlo sampling algorithm is proposed to estimate the total probability for all paths between two nodes and two bounds are given using Bonferroni and Chung-Erdos inequalities and path/cut set. Finally, an extensive experimental evaluation on both real and synthetic datasets demonstrates the efficiency and effectiveness of our approaches in answering label constraint reachability queries in uncertain graphs.In summary, this thesis proposes several solutions, such as path index preprocessing technology, subpath pruning, branch path filter and other strategies based on the typical characteristics and challenges of label constraint query on uncertain graphs. All the methods are aimed at providing robust and efficient approaches for LCR problems. The algorithms and pruning ideas used in this thesis will lay a solid foundation on related research. |