Research On Compact Storage Technology Of Knowledge Graph

Posted on:2023-05-15

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Zhu

GTID:2568307073491174

Subject:Computer technology

Abstract/Summary:

Knowledge graphs provide important support for semantic retrieval and reasoning methods. In recent years, the scale of knowledge graph data has grown rapidly. As demand for offline processing of knowledge graph data increases, applications require the storage system to deliver higher read-write efficiency and retrieval performance. To this end, this thesis proposes a semi-static knowledge graph storage system optimized for batch construction, which also partitions hot data and scales horizontally; it then encodes the data compactly through self-indexing compression technology and further optimizes the storage scheme and the data reorganization method.

Firstly, a storage mode optimized for batch data construction is proposed for semi-static knowledge graphs. By introducing an intermediate representation of graph data between the data warehouse and the graph database, it alleviates the problems of complex data processing pipelines, untimely data updates, and data hot spots. First, the property graph model is used to model the knowledge graph, with the one-hop subgraph as the basic graph storage unit; then a strategy based on random sampling is used to shard the storage of dense edges and to partition the data in a globally ordered way; finally, full data builds and two asynchronous data reorganization modes are supported, so that static stock data can be updated and incremental data remains timely.

Secondly, a graph data storage method based on self-indexing compression technology is proposed, which aims at dedicated compression and storage of graph index and attribute data. First, building on the rMM-tree and depth-first compact tree encoding, both succinct data structures, the DFUDS Trie is proposed as a compact representation of the one-hop subgraph index and of vertex and edge attributes, with compressed bit vectors used to store and index the underlying bit fields; then an enumeration array based on a compact representation is used to store ordered attributes and the bidirectional relationships of one-hop subgraphs. Compared with general-purpose compression methods, this approach preserves the compression ratio of graph data and greatly reduces deserialization overhead, at the cost of some encoding and retrieval efficiency; the data is compressed in self-indexed form, and the cold-start efficiency of the system improves.

Finally, an efficient method for merging multiple DFUDS Tries is proposed. The method builds on the characteristics of compact graph data storage and aims to further optimize the multi-way data reorganization process. It exploits the physical contiguity of subtrees in the DFUDS encoding and implements a dynamic memory swap-in and swap-out strategy for DFUDS Tries during merging. Memory is released and nodes are persisted dynamically as the merge progresses, which mitigates the excessive temporary space otherwise required when multiple DFUDS Tries are merged during graph data reorganization.

To sum up, this thesis first proposes a semi-static knowledge graph storage system that optimizes the data structure and storage method of the knowledge graph; it then introduces self-indexing compression technology into the system to provide dedicated compression of graph data; finally, it proposes a multi-way merge method that addresses the excessive memory overhead of the data update process.
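The subtree contiguity that the merge method relies on can be illustrated with a minimal sketch of the general DFUDS (depth-first unary degree sequence) encoding. This is not the thesis's implementation: the function names `dfuds_encode` and `subtree_end` and the dictionary-based tree are hypothetical, the sequence is kept as a Python string rather than a compressed bit vector, and the subtree boundary is found by a linear scan where a range min-max structure would answer the same query much faster.

```python
from typing import Dict, List


def dfuds_encode(children: Dict[str, List[str]], root: str) -> str:
    """Encode a tree as a DFUDS parenthesis string (hypothetical helper).

    Nodes are visited in preorder; a node with d children is written as
    d opening parentheses followed by one closing parenthesis.
    """
    parts = ["("]  # one leading "(" makes the whole sequence balanced
    stack = [root]
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        parts.append("(" * len(kids) + ")")
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(kids))
    return "".join(parts)


def subtree_end(seq: str, start: int) -> int:
    """Return the index one past the subtree whose root description starts at `start`.

    A subtree with n nodes contributes n - 1 "(" and n ")", so it ends at the
    first position where (opens - closes) drops to -1. Linear scan for clarity.
    """
    excess = 0
    for i in range(start, len(seq)):
        excess += 1 if seq[i] == "(" else -1
        if excess == -1:
            return i + 1
    raise ValueError("start does not begin a valid node description")


if __name__ == "__main__":
    # Toy tree: a has children b and c; b has one child d.
    tree = {"a": ["b", "c"], "b": ["d"]}
    seq = dfuds_encode(tree, "a")
    print(seq)                          # ((()()))
    # Node b's description starts at index 4; its subtree (b and d) is one slice.
    print(seq[4:subtree_end(seq, 4)])   # ())
```

Because every subtree occupies a single contiguous slice of the sequence, a merge can in principle write out a finished subtree and release its memory before continuing, which is the property the swap-in and swap-out strategy described above takes advantage of.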

Keywords/Search Tags:

Knowledge Graph Storage, Distributed System, Succinct Data Structure, Indexable Compression

Related items

1	Research On Distributed Storage And Retrieval Technology Of Large-scale Knowledge Graph
2	Design And Implementation Of Knowledge Graph Storage Access System Based On Big Data Platform
3	Distributed Storage Framework Design For Heterogeneous Data Reliability Requirements
4	Research And Implementation Of Data Compression In Distributed Storage
5	Research On Key Techniques Of Distributed Data Processing And Storage
6	Study On Distributed Compression Storage Optimization Based On RCFile Storage Model
7	Research On Financial Time Series Knowledge Graph Data Storage System
8	Research And Implementation Of Academic Knowledge Graph Based On Natural Language Processing Technology
9	Research On The Construction And Visual Interaction Of Data Structure Knowledge Graph
10	Design And Implementation Of Visual Analysis System For Knowledge Graph