Font Size: a A A

Approximate Query Techniques Research Based On BSP For S-t Shortest Path Queries Over Big Graphs

Posted on:2016-05-22Degree:MasterType:Thesis
Country:ChinaCandidate:J Y ZhangFull Text:PDF
GTID:2370330542489422Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Currently,query processing and computation technology in centralized environments cannot meet the demand of real data analysis with the drastic increase of data size,so parallel processing systems adapted to various applications have been paid more attention.The s-t shortest path distance query is most widely used in graph-oriented applications,and the existing techniques in centralized implementation have been effectively explored but cannot be adapted into distributed environments directly.To solve this problem,based on BSP model which is extensively leveraged in parallel graph processing framework,this thesis mainly studies the shortest distance approximate query techniques.Specifically,based on different topology structures of undirected and directed graphs,different optimization strategies are proposed separately.First,for approximate queries of undirected graph,the index techniques of local vertex coverage are presented to compress the big graph,and thus the compressed graph is obtained by constructing reference nodes and boundary nodes.To ensure the accuracy of queries,the remaining vertexes are organized according to the situations of outgoing and incoming edges.On the phase of on line query,the shortest distance is gained by bi-directed BSP iterative comparison mechanism.Finally,the query result is estimated within the upper bound and the effectiveness is verified by the experiments.On the other hand,faced with directed graph,which is rarely researched but is widely applied,the new level constructing strategy is put forward,and the result is returned with a precision guarantee on the compressed graphs.Firstly,the first level graph is obtained by the local betweenness algorithm,and all vertexes are stored effectively.Next,the specific relax operation is selected by the numbers of outgoing and incoming edges,and some vertexes are deleted or reconstructed to build smaller compressed graphs.Thirdly,according to the highest level graph which source and destination vertexes belong to,we can find the new query vertexes on the highest level graph by backward tracking.At last,the optimization of the basic BSP iteration query process is proposed,which aims at early termination based on hops.One is bounded by statistical data,and the other is based on the current distance value distribution.The latter may lead to errors,but the probability assurance can be gained.The experiments verify the effectiveness of the proposed methods.The two framework and corresponding optimization methods proposed in this thesis have offered an effective solution for shortest path queries and extended queries over large scale graphs.Theoretical analysis and tests over extensive real-world graphs have demonstrated our proposed methods can gain significant performance gain when the query precision is guaranteed.Also,the proposed framework will be valuable for designing other approximated query processing techniques over large graphs.
Keywords/Search Tags:BSP model, directed graph, approximate query, index technology, sketch strategy, s-t shortest distance
PDF Full Text Request
Related items