Research And Implementation Of Overlapping Community Detection In Single Large Graph

Posted on: 2016-05-29
Degree: Master
Type: Thesis
Country: China
Candidate: T T Guo
Full Text: PDF
GTID: 2370330542457359
Subject: Computer application technology
Abstract/Summary:
In recent years, with the rapid development of information technology, the scale of social network data has grown explosively and the complexity of network structure has increased. Community structure is a ubiquitous topological characteristic of complex networks, but in real social networks communities often overlap; for example, a person may be active in several interest groups at the same time. Mining the overlapping community structure of complex networks has therefore become a hot research topic, yet the problem still lacks a fully satisfactory solution. In addition, most current overlapping community detection algorithms rely on network topology alone and largely ignore node attribute information, even though attribute information is of great help in mining a more accurate community structure. To address these problems, the thesis studies the following aspects.

First, building on the multi-label propagation mechanism of the COPRA algorithm and targeting its existing problems, the thesis proposes COPRA-NLS, an optimized COPRA algorithm based on a new label selection strategy, and implements it on the BC-BSP platform. The new label selection strategy introduces the local clustering coefficient of the network and recently updated label information, which effectively reduces randomness in the label update stage and improves the efficiency of the algorithm's iterations.

Second, also based on multi-label propagation, the thesis presents SA-COPRA, an overlapping community detection algorithm that considers the topology and the attribute information of a social network at the same time. The main work includes:
(1) For attributed graphs, the thesis designs a framework for overlapping community detection that takes both topology and attributes into account;
(2) A unified graph model of topology and attributes is proposed. The model assigns weights to the structure edges and the attribute edges of the graph using different calculation methods, then applies merge rules to measure topology and attributes uniformly, forming the structure-attribute augmented graph;
(3) A neighborhood random walk model is adopted to calculate the similarity between nodes in the structure-attribute augmented graph, generating the similarity matrix;
(4) Combining the structure-attribute augmented graph and the similarity matrix, the thesis uses the COPRA algorithm to detect overlapping communities and introduces the comparative similarity between nodes.
As a result, within each overlapping community the structure is tightly connected and the attribute information is homogeneous, while between communities that do not overlap the structure is loose and the attribute information is heterogeneous.

Finally, the thesis discusses the methods and techniques adopted when implementing the algorithms on the BC-BSP platform. Optimizing the global aggregation module of the BC-BSP system, by using HDFS to store the aggregate values, further improves the scalability of the algorithms. Tests on actual network data sets and neural network data sets show that, on large-scale social networks, the two proposed algorithms can mine high-quality overlapping communities within an acceptable time.
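Steps (2) and (3) of SA-COPRA can be illustrated with a small sketch. The abstract does not give the weighting formulas, the merge rules, or the exact random walk definition, so everything specific here is an assumption made for illustration: each (attribute, value) pair becomes an extra "attribute vertex", structure and attribute edges get the constant hypothetical weights `w_struct` and `w_attr`, and similarity is the truncated random walk sum R = sum over l = 1..L of c(1-c)^l P^l, where P is the row-normalized transition matrix, `max_len` is L and `restart` is c. The function names are hypothetical and this is not the thesis's implementation.

```python
from collections import defaultdict


def build_augmented_graph(edges, attributes, w_struct=1.0, w_attr=1.0):
    """Build an undirected weighted graph {node: {neighbor: weight}}.

    Structure edges come from `edges`. Each (name, value) attribute pair
    becomes an extra attribute vertex linked to every node that has it.
    The constant weights are placeholders (assumption), not the thesis's rules.
    """
    adj = defaultdict(dict)
    for u, v in edges:
        adj[u][v] = w_struct
        adj[v][u] = w_struct
    for node, attrs in attributes.items():
        for name, value in attrs.items():
            attr_vertex = ("attr", name, value)
            adj[node][attr_vertex] = w_attr
            adj[attr_vertex][node] = w_attr
    return dict(adj)


def transition_probabilities(adj):
    """Row-normalize edge weights into one-step transition probabilities."""
    probs = {}
    for u, nbrs in adj.items():
        total = sum(nbrs.values())
        probs[u] = {v: w / total for v, w in nbrs.items()}
    return probs


def random_walk_similarity(adj, max_len=3, restart=0.2):
    """Return sim[u][v] = sum_{l=1..max_len} restart * (1 - restart)**l * P^l[u][v]."""
    probs = transition_probabilities(adj)
    nodes = list(adj)
    # walk[u][v]: probability that a walk of the current length from u ends at v
    walk = {u: {u: 1.0} for u in nodes}
    sim = {u: defaultdict(float) for u in nodes}
    for length in range(1, max_len + 1):
        next_walk = {}
        for u in nodes:
            row = defaultdict(float)
            for mid, p in walk[u].items():
                for v, q in probs[mid].items():
                    row[v] += p * q
            next_walk[u] = row
        walk = next_walk
        coeff = restart * (1.0 - restart) ** length
        for u in nodes:
            for v, p in walk[u].items():
                sim[u][v] += coeff * p
    return {u: dict(row) for u, row in sim.items()}


# Toy data (made up): "a" and "d" lie in different structural components
# but share the attribute value topic=graphs.
edges = [("a", "b"), ("b", "c"), ("d", "e")]
attributes = {"a": {"topic": "graphs"}, "d": {"topic": "graphs"}}

augmented = build_augmented_graph(edges, attributes)
similarity = random_walk_similarity(augmented)
print(similarity["a"].get("d", 0.0))
```

In the toy data, "a" can reach "d" only through the shared attribute vertex, so their similarity is nonzero on the augmented graph and zero on the structure-only graph. That is the effect a unified structure-attribute model is meant to capture; how the resulting similarity matrix then feeds into COPRA's label propagation in step (4) is not specified in the abstract and is not sketched here.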
Keywords/Search Tags:Overlapping Community, Label Propagation, Clustering Coefficient, Attributed Graph, Neighborhood Random Walk Similarity