Research On Large Scale Network Analysis System Based On Distributed Graph Calculation

Posted on: 2019-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: B Zhao
Full Text: PDF
GTID: 2310330545955616
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet, data that were originally independent or only sparsely connected are becoming increasingly interlinked, producing large volumes of data with complex relationships. Graph analysis techniques are needed to reflect the true state of such data accurately. Two problems stand in the way. First, existing graph computation techniques analyze static graphs, whereas real-world data change from moment to moment, so static computation models cannot reflect the real state of the network. Second, because graph structure is complicated, the computational complexity of graph algorithms is hard to reduce and the cost of distributed communication is high. These problems have long restricted the development of graph computation and the prospects of graph analysis.

This thesis first studies dynamic storage structures and proposes TSGraph, a distributed graph storage architecture suited to storing temporal graphs. The underlying storage model is the BigTable model, with HBase as the back-end store. By optimizing the storage structure, the RowKey design, and the query indexes, the architecture can quickly retrieve a static snapshot of the dynamic graph at a given point in the time dimension, which addresses the problem of persistent storage and analysis of dynamic graphs.

Second, the thesis studies the implementation details of incremental graph algorithms and designs and implements an incremental graph algorithm based on Spark GraphX. Experimental results show that the algorithm greatly improves the efficiency of graph computation at the cost of a tolerable loss of accuracy: when each incremental graph is 10%-20% in size, overall efficiency improves by a factor of about 3 compared with the full-scale algorithm. Because the incremental algorithm loses accuracy, a full-correction scheme is adopted to ensure that the computed result always falls within the tolerated accuracy loss.

Finally, based on the dynamic storage system and the incremental computing system described above, an integrated graph analysis platform, OpenGraph, is designed and implemented. The platform comprises four modules: data management and loading, graph relationship query, real-time execution of incremental algorithms, and graph attribute index analysis. It can store and analyze large-scale dynamic graphs.
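
To illustrate how a time-dimension snapshot query can work on top of a store that keeps keys in sorted order, here is a minimal Python sketch. It is not the TSGraph design: the abstract does not give the actual RowKey layout, so the `(src, dst, ts)` key, the `TemporalEdgeStore` class, and the `"add"`/`"del"` markers are all assumptions made for illustration. A sorted in-memory list stands in for HBase, and no HBase API is used.

```python
import bisect


class TemporalEdgeStore:
    """In-memory stand-in for a table whose row keys are kept in sorted order.

    Hypothetical key layout: (src, dst, ts). Each key records one edge event.
    """

    def __init__(self):
        self._keys = []  # sorted list of (src, dst, ts) tuples
        self._ops = {}   # key -> "add" or "del"

    def put(self, src, dst, ts, op="add"):
        """Record that edge src->dst was added or deleted at time ts."""
        key = (src, dst, ts)
        if key not in self._ops:
            bisect.insort(self._keys, key)
        self._ops[key] = op

    def neighbors_at(self, src, t):
        """Return the out-neighbors of src in the snapshot at time t."""
        # Seek to the first key with this src prefix, then scan forward,
        # which mimics a prefix scan over sorted row keys.
        i = bisect.bisect_left(self._keys, (src,))
        alive = set()
        while i < len(self._keys) and self._keys[i][0] == src:
            key = self._keys[i]
            _, dst, ts = key
            # Events for one (src, dst) pair arrive in time order, so the
            # last event with ts <= t decides whether the edge exists.
            if ts <= t:
                if self._ops[key] == "add":
                    alive.add(dst)
                else:
                    alive.discard(dst)
            i += 1
        return alive


store = TemporalEdgeStore()
store.put("a", "b", 1)
store.put("a", "c", 2)
store.put("a", "b", 3, op="del")
store.put("b", "c", 1)

print(sorted(store.neighbors_at("a", 2)))  # ['b', 'c']
print(sorted(store.neighbors_at("a", 3)))  # ['c']
```

Placing the source vertex first in the key keeps all events for one vertex contiguous, so a snapshot lookup becomes a single range scan. A real design would also need to bound how much history each scan reads, which this sketch does not attempt.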
Keywords/Search Tags: TSGraph, OpenGraph, Dynamic Graph, Incremental Computation, Graph Snapshot, Spark GraphX