Font Size: a A A

Mining and Querying Community Structure in Massive Networks

Posted on:2015-05-26Degree:Ph.DType:Thesis
University:The Chinese University of Hong Kong (Hong Kong)Candidate:Huang, XinFull Text:PDF
GTID:2470390017993791Subject:Computer Science
Abstract/Summary:
Graph has been widely adopted to model complex networks in nature and human society. Community structures, as functional building blocks, exist in many real-world networks, for example, social networks and biological networks. Thus, mining and querying community structure in massive networks is of great importance for a deeper understanding and better management of such networks. However, the massive graph volume (million-scale or even billion-scale) and rapid evolution present huge challenges, which need highly efficient solutions. This thesis studies three different problems on mining and querying community structure in massive networks, and designs efficient and scalable solutions.;The first problem is online community search in massive networks . Given a query vertex in a graph, the problem is to find meaningful communities that the vertex belongs to in an online manner. A novel community model is proposed based on the k-truss concept, which brings nice structural and computational properties. A compact and elegant index structure is designed to support the efficient search of k-truss communities with a linear cost w.r.t. the community size, which is optimal. In addition, the k-truss community search problem is further investigated in dynamic graphs setting with continuous insertions and deletions of graph vertices and edges.;The second problem is top-k structural diversity search in social networks. The structural diversity of a person is defined as the number of communities in his/her social relationships, which shows as an important factor in the social contagion process in a recent study. Thus a fundamental problem is to find k individuals who have the highest structure diversity values in a social network. An efficient search framework, equipped with an incrementally refinable upper bound for pruning search space, is proposed to solve the top-k problem. Several heuristic search strategies are also developed for further speeding up the structural diversity search.;The third problem is semi-supervised clustering of graph objects, that is, given a set of graph objects, and the supervision information in the form of pairwise constraints of must-links and cannot-links, how to automatically partition these graph objects into disjoint clusters. We propose to use discriminative subgraph patterns, which are common community structures in the graph objects as the clustering features, and design an objective function which incorporates the constraints to guide the subgraph feature mining and selection process. When the graph objects are represented in a feature vector, we use semi-supervised kernel K-means to cluster all graph objects.;The above three studies are related but with different focuses. For all three studies, we conduct extensive experiments on real-world networks to demonstrate the effectiveness and efficiency of the proposed algorithms.
Keywords/Search Tags:Networks, Community, Graph, Search
Related items