
Research On The Methods Of Community Detection Based On Complex Network In Cloud Platform

Posted on: 2018-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: H J Wang
GTID: 2310330515471036
Subject: Software engineering
Abstract/Summary:
Complex networks are large-scale networks with dynamic characteristics and complex topological structure. Unlike random networks, they exhibit small-world, scale-free, and super-family characteristics, which can guide their study. Community structure is a significant feature of complex networks: nodes in the same community have similar features and are densely connected, whereas nodes in different communities have diverse features and are loosely connected. Mining community information gives a deeper understanding of network structure. However, with the rapid development of the Internet and mobile terminal technology, the scale of complex networks, typified by virtual social networks, is expanding rapidly. Traditional serial partitioning methods are inefficient and cannot meet real-time processing requirements when partitioning huge graphs. In recent years, cloud computing has emerged as a technology and service model for handling massive data, providing computing and storage resources with high scalability, high reliability, and complete fault-tolerance mechanisms. Hadoop and Spark are two of the most widely used platforms, each with its own advantages: Hadoop is good at batch processing of large-scale data, and Spark is better suited to iterative jobs. This thesis applies the two platforms to community detection. It consists of the following work:

1. A framework is presented for static community detection on a cloud computing platform. The framework has three steps: parallel coarsening, parallel detection, and decoarsening with optimization. Specifically, triangles are chosen as the coarsening source to coarsen the graph on the Hadoop platform. Experiments show that the coarsening preprocessing can reduce the scale of the graph without destroying its original structure, and that decoarsening can further optimize the partition results.

2. A multi-level parallel community detection algorithm based on the Hadoop platform is developed. The algorithm comprises two procedures: Q-optimization and level-mergence. Q-optimization obtains the best division at a given level. Level-mergence merges the resulting structures to form the topology of the next level. The two procedures are repeated until no higher global modularity is obtained. The experimental results show the reliability of the proposed algorithm and its speed advantage on large-scale datasets.

3. An incremental learning algorithm for community detection is presented to address frequent small-scale changes in complex networks. It considers four kinds of graph change: addition of an edge, removal of an edge, addition of a single node, and addition of a batch of nodes. To improve detection efficiency, the Spark platform is used to handle the addition of a batch of nodes. Experiments show that the algorithm improves speed several-fold with almost no loss of modularity.

4. A visualization system is built to display the community detection results. The system renders the uploaded original graph and the detection results as force-directed graphs using the D3 tool and displays them in the browser with attached information such as node count, link count, average degree, and modularity.
Keywords/Search Tags:Community Detection, Hadoop, Spark, Incremental Learning, Visualization
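
The abstract uses global modularity (Q) both as the stopping criterion for the multi-level algorithm and as the quality measure for the incremental algorithm, but it does not state the formula. As a minimal sketch, assuming Q is the standard Newman-Girvan modularity for an undirected, unweighted graph, Q = sum over communities c of (e_c / m) - (d_c / 2m)^2, where m is the number of edges, e_c the number of edges inside c, and d_c the total degree of nodes in c. The function name `modularity` and the input layout (an edge list plus a node-to-community mapping) are illustrative choices, not taken from the thesis.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition (assumed definition).

    edges:     list of (u, v) pairs, one per undirected edge
    community: dict mapping each node to its community label
    """
    m = len(edges)
    if m == 0:
        return 0.0

    intra = defaultdict(int)   # edges with both endpoints in community c
    degree = defaultdict(int)  # sum of node degrees in community c

    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1

    # Q = sum_c [ e_c / m - (d_c / 2m)^2 ]
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


if __name__ == "__main__":
    # Two triangles joined by a single bridge edge (2, 3).
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
    merged = {n: "A" for n in range(6)}
    print(modularity(edges, split))   # one community per triangle
    print(modularity(edges, merged))  # everything in one community
```

Under this definition, a loop of the kind the abstract describes would keep a candidate merge only if it raises the value returned by `modularity`, and stop once no merge does.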