Research On Cluster Enumeration Algorithm For Large-scale Bipartite Graphs

Posted on:2020-07-24

Degree:Master

Type:Thesis

Country:China

Candidate:Y He

Full Text:PDF

GTID:2370330590478653

Subject:Computer technology

Abstract/Summary:

PDF Full Text Request

Data mining is a practical subject.It applies specific solutions according to specific problems,finds the rules in the complicated data,and provides decision-making assistance to the researchers.Graph theory is a discipline that studies the connection of things in the objective world.The combination of the two is to hope to dig out the objective laws of the data in the objective world through specific programs.In today’s society,more and more data can be abstracted into a network structure,so there are more and more data mining problems in the field of graph theory.Among them,there are many data mining problems in the field of bipartite graphs.A lot of research results have been obtained.The main focus of this paper is on the bipartite and maximal bicilque enumeration.Biclique is a subgraph structure in the bipartite graph.Maximal biclique enumeration is of great significance in the real society.It can be applied to many fields,for example,purchase trend prediction,statistical analysis of social networks.To explore some interesting structures of protein interaction networks,e-commerce sites,and other applications.In order to make the research content more practical,the author first analyzes the characteristics of the actual bipartite graph data,and finds that it has the characteristics of large scale and sparse data.These features will play an important role in algorithm design.Among the previous research results,some work did not pay attention to these characteristics.In solving the problem of maximal biclique enumeration,the best method in the previous research results is a parallel solution mrMBEA based on MapReduce framework,which has good scalability and speedup,but at the same time,this solution also has certain defects,such as not using the structural features of the sparse bipartite graph,biasing the task amount estimation,and the like.So this paper designs a parallel solution with better effect.Firstly,this paper starts with the serial algorithm,then parallelizes it to achieve the purpose.After reviewing the data,it is found that there is a kind of iteration based on it.The recursive serial algorithm iMBEA has a significant effect in solving the problem of maximal biclique enumeration,but it also has many shortcomings,such as not analyzing the structural characteristics of the sparse bipartite graph,redundant algorithm process,and it is not easy to parallel.In this paper,a new serial algorithm sMBEA is designed.The performance advantages of the algorithm are also verified by experiments.Based on sMBEA,the parallel solution psMBEA is obtained by a dynamic load balancing strategy and a shared parameter storage structure.The experimental results show that its performance is more efficient than the traditional mrMBEA algorithm.

Keywords/Search Tags:

Maximal Bicliques Enumeration, Graph Data Mining, Parallel Algorithms, Sparse Bipartite Graph

PDF Full Text Request

Related items

1	Research On Algorithms For Dense Subgraph Mining In Large Sparse Graphs
2	Research On Several Distributed Graph Processing Algorithms And A Unified Graph Programming Framework
3	Research On Bipartite Graph Matching Problem
4	Bipartite Matching-extendability Of Series-parallel Graphs And Chromatic Number Of P3(G)
5	Research On Data Query Optimization Technology Of Graph Database
6	Motif Detection Algorithm On Network Graph
7	Research On Algorithms Of Cohesive Subgraph Mining In Large-Scale Graphs
8	Parallel Algorithms And Preconditioning For Large Sparse Linear Sytems
9	Selecting The Optimal Group For Multi Valued Attribute Bipartite Graphs:Research And Implementation Of Skyline Bicliques
10	Parallel Mining Framework Of Large-Scale Graph