Research On The Similarity Search Problem In Graph Databases

Posted on:2022-04-05

Degree:Master

Type:Thesis

Country:China

Candidate:Y H Li

Full Text:PDF

GTID:2510306323451244

Subject:Software engineering

Abstract/Summary:

PDF Full Text Request

At present,the field of Internet is developing rapidly,especially the complex and closely related data is growing by a large margin.These complex and interrelated data are usually represented by graph model.With the wide application of graph,efficient graph data management technology is becoming more and more important.Graph search operation is frequently used in large-scale graph database retrieval,so it is widely concerned.Graph similarity search is a common graph database operation,the most commonly used similarity measure is graph edit distance(GED).It is suitable for any type of graph and can capture the structural differences between two graphs accurately and effectively.In this paper,the graph similarity search problem using graph edit distance as similarity measure is deeply studied.Specifically,given the graph database and a query graph,the user gives the threshold of the editing distance of the graph,and the graph similarity search is in the graph However,the search based on graph editing distance is challenging in large graph database,which has proved NP hard.Most of the existing graph similarity search methods use the filter verify framework.Roughly calculate the lower bound of the edit distance between the query graph and each graph in the given database.If the lower bound of the edit distance is greater than the threshold given by the user,the data graph will be filtered out.On the contrary,the edit distance of the obtained graph is less than or equal to the similarity threshold set by the user,and the qualified graph is taken as the candidate set.Finally,the candidate set is verified.Aiming at the problems of graph partition overlap,index and approximate calculation in graph similarity search,this paper proposes a new filtering method and verification method to further improve the query performance of similarity search algorithm in graph database.The specific work of this graph similarity search method can be summarized roughly as follows:In the first stage,a graph similarity search method based on incremental partition is proposed to overcome the shortcomings of most algorithms in graph partition.It uses the idea of q-gram to enhance partition effect,quickly filters some graphs,and extends the traditional A* algorithm to verify the graphs in candidate sets.Two real data sets,AIDS and protein data sets,were used in the experiment.The results show that the graph similarity search method based on incremental partition is effective.Compared with the existing similarity search methods,similarity search method based on incremental partition has a great improvement in the comparison of response time and candidate set size between verification and filtering.In the second part,this paper proposes a graph similarity search method based on clustering tree index.Firstly,this paper optimizes the traditional k-means algorithm which must initialize k,and constructs a corresponding tree index for each cluster.For a given query graph,if it shares few graph partitions with a cluster center,the corresponding tree index of the cluster will be in the cluster Most of the subtrees will be pruned to quickly filter out a large number of graphs,thus speeding up the search time.The experimental results on two real large-scale data sets AIDS and NASA show that the performance of the algorithm is better than the existing algorithms in index construction time,response time and filtering time.In the third part,aiming at the shortcomings of most graph similarity calculation,agraph similarity search method based on graph neural network is proposed.By using theidea of graph isomorphism network and neural tensor network,the approximate value ofgraph editing distance can be found quickly.The experimental results show that this model is better than othermodels in time.

Keywords/Search Tags:

PDF Full Text Request

Related items

1	Research On Similarity Search In Large-Scale Graph Databases
2	Research On Single Cell Clustering Based On Graph Similarity Learning
3	Similarity Judgment Of Linear Arrangement Of Buildings Based On Graph Convolution Neural Network
4	Research On Clustering Algorithm Based On Graph Structure
5	Research On Graph Similarity Judgment Method Based On Upper Bound Of Graph Edit Distance
6	The Atomic Bond Connectivity Index Of The Graph
7	Research On Graph Classification Based On Graph Neural Networks
8	Research On Graph Embedding Model Based On Deep Neural Networks
9	Research And Design Of Gene Similarity Search Method Base On Heterogeneous Network
10	Some Special Classes Of Graph Zagreb Index