Font Size: a A A

Study And Improvement Of Overlapping Community Discovery Based On Local Expansion

Posted on:2017-08-28Degree:MasterType:Thesis
Country:ChinaCandidate:Y LongFull Text:PDF
GTID:2310330503966147Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the rapid development of the network and the wide use of social media, it makes people close than ever. A large social network consists of the complex relationship, which attracts the attention of many scholars who are dedicated to the study of complex networks. Finding and extracting modular structure from complex networks which is called the community detection. Previous scholars worked on non-overlapping networks which in real life is not practical. So the research on overlapping networks has been studied and has made great progress in recent years. In this paper, I propose an improved scheme for the overlapping community detection algorithm.Overlapping community is that nodes belong to more than one community. Effectively identify the overlapping nodes which are the key point of the research of this paper. The well-known algorithms are LFM algorithm and GCE algorithm, they are both using the local information of the network and the concept of the growth of a single seed. This paper proposes a new scheme based on the concepts which consist of the seed selection, community expanded pruning, similarity judgment, parallel model.(1) LFM algorithm selects seed nodes at random, which affects the accuracy of algorithm. GCE algorithm finds all cliques of a network, which affects the efficiency of the algorithm. I adopt a compromise strategy, which is to get the core seed structure by deleting nodes with low influence in the network. Besides, I believe the theory that the points with more degree are more important and if the neighbor node is very important, the node itself is also important.(2) LFM algorithm and GCE algorithm don't judge the candidate set when extending a seed which has seriously affected the performance of the algorithm. In this paper, through detailed derivation and rigorous mathematical proof of the expansion process, I can prune candidate sets after community expansion, to further enhance the efficiency of the algorithm.(3) I must compute the similarity between communities to improve the accuracy of the results. In addition to the consideration of community's node set, this paper considers the influence of the neighborhood node of the community for the similarity measure formula, which has the practical significance.(4) Parallel processing the expansion. Parallelization is an important means to improve the performance of algorithms. Through analysis process of the algorithm, you can easily reduce the data dependency and introduce the producer-consumer model to the algorithm. The experiment will operate on multi-core CPU due to environmental constraints.(5) Through the experiment, It has been proved that seed selection strategies is feasible within a certain range by applying to the real network, and overall algorithms improve the accuracy and reduce the time loss. To measure the accuracy with NMI(Normal Mutual Information), we found that the proposed method is sensitive to mixing parameter and community structure, but it superior to LFM algorithm and not inferior to the GCE algorithm.
Keywords/Search Tags:Overlapping community discovery, node influence, local optimization, parallel computing
PDF Full Text Request
Related items