Font Size: a A A

Community Structures And Generative Models Of Complex Networks

Posted on:2010-12-30Degree:DoctorType:Dissertation
Country:ChinaCandidate:N DuFull Text:PDF
GTID:1100360308462199Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Driven by the increased computing power and the emergence of very large database systems, people are now able to use techniques from statistics, data mining, and machine learning to explore questions that could not be addressed before from the massive data collected in various fields of the real world. In the past 10 years, researchers have found that both of the physical systems in nature and the engineered artifacts in human society have essentially networked organizations that are large intricate web of connections between massive interacting entities. These networks often have the'scale-free'property, the'small-world'phenomena, as well as the high'clustering-coefficients'. Recently, due to the pervasiveness of social-computing based applications, community structure mining and analysis in networks has been becoming a hot interdisciplinary research topic.The main interest of our research has been in understanding the structural properties and patterns of the underlying microscopic community structures in real world networks. We seek to answer the questions like, 'How can we mine the meaningful community structures efficiently in large complex networks?','What patterns do these communities hold?', 'How do they evolve over time?', and'How do we apply the learned experience and knowledge to the Telecom industry?'etc. The main contributions of this dissertation include the following: 1. We propose the first parallel algorithm Peamc to enumerate all the maximal cliques in large-scale complex networks, which is efficient enough to meet most of the engineering challenges. Experimental results show that Peamc outperforms existing algorithms in many real-world networks.2. We present the first clique-based algorithm ComTector to mine the community structures. For the networks whose community structures are known before, ComTector is able to give the correct results, while for the networks whose structures are otherwise hard to understand, ComTector can detect meaningful communities with much higher efficiency than most of the existing competitors. For the discovered communities, we further propose the first community-profiling algorithm ComResume by combing a node's natural attributes and its network topology to describe the constitution of these communities. Tested by the use-cases of the Scientific Collaboration Network in BUPT, and the Personal Communication Networks, it is shown that ComResume can help people in a good understanding of the network structure, and the community formation. In addition, we also bring forward a community-based algorithm Sketcher to mine the backbone of a given network, which can be used to compress and visualize the network organization in an intuitive way.3. We propose an efficient algorithm COCD to mine the overlapping community structures in large-scale networks. COCD does not require any priori knowledge about the number or the original partition of the network. Rather than relying on the contents of the interactions, it uses the network topology only. Experimental results from numerous real-world networks show that COCD outperforms the state-of-the-art competitors both in effectiveness and in efficiency. Besides, we design the first kernel-based algorithm ComTracer to trace the evolving paths of different communities in a series of network snapshots. We show that a community's lifetime and persistence depend not only on its internal structure, but also on the constitution of the network. Finally, we propose a new algorithm BiTector based on COCD to find the community structures in bipartite networks of the real world.4. We are the first to find the surprising patterns that maximal cliques follow, and to report the power-laws that the edge weights of the cliques hold. The discovered patterns are stable and persistent in several, diverse networks, based on which we further propose the first utility-driven graph generator PaC to model the weighted time-evolving networks. PaC uses a well-defined utility function instead of any randomness to guide the behavior of each node. Only based on local information, it still can generate graphs that follow all of the old and the new statistical patterns.
Keywords/Search Tags:Complex Networks, Social Network Analysis, Community Structure, Graph Generator, Parallel Algorithm
PDF Full Text Request
Related items