Font Size: a A A

Storage Optimization And Efficient Arbitrary Version Switch On Billion Scale Dynamic Graph

Posted on:2021-12-03Degree:MasterType:Thesis
Country:ChinaCandidate:T W YingFull Text:PDF
GTID:2480306104488254Subject:Computer system architecture
Abstract/Summary:PDF Full Text Request
Graph play a vital role in life now.In the field of big data,more and more applications abstract information into graph and use graph algorithms to process the graph data,such as social network graphs,road graphs,and routing graphs.The representation of real data separates the representing of data from processing.As relations gradually changes over time,the corresponding graph data gradually evolves into multi-version graph data.How to store and randomly access multiple versions of graph has become a new problem in graph processing.For the processing of multi-version graph,existing methods either cause massive storage space overhead and a high degree of storage redundancy when the scale of the graph increases.or takes a lot of time for version switching.After analysis,it is found that the existing multi-version graph processing has the following characteristics: 1)During the evolution of the graph,the high degree vertices has a greater impact on memory.These vertices not only occupy most of the edges of the entire graph,but also have a significantly higher tendency to change.The low degree vertices are less likely to change.2)In terms of access frequency distribution to all versions,some versions are accessed more frequently than others.Optimizing access to these versions may greatly improve the overall average access speed.Based on the above observations,Pensieve,a tilt-aware multi-version graph processing system,is proposed.Pensieve mainly improves processing efficiency from two aspects.First,Pensieve uses a differentiated graph storage strategy,using the replication method for low-level points to retain its accessibility,while using the differential method for high-level points to reduce memory overhead.In multi-version graph processing,this hybrid strategy makes a good trade-off between storage overhead and version switching time.Secondly,thanks to the time locality of the accessed version in graph processing,Pensieve designed and implemented a novel forward switching scheme,placing the version with higher access frequency in a place that is more easily accessible,significantly improving the most recent version Access efficiency.Pensieve also has several small optimization points.Experimental results on real data sets show that Pensieve performs significantly better on multiple data sets than other programs.It leads the current best solution in terms of multiple indicators such as average access time and memory usage.
Keywords/Search Tags:Multi-version graph storage, graph version switching, frequency skewness
PDF Full Text Request
Related items