Font Size: a A A

Community Detection In Complex Networks Using Density-based Clustering Algorithm: Method And Application

Posted on:2017-07-14Degree:DoctorType:Dissertation
Country:ChinaCandidate:T YouFull Text:PDF
GTID:1310330536468090Subject:Applied Statistics
Abstract/Summary:PDF Full Text Request
The theory of complex network offers a simple yet effective approach to model the complex systems from different disciplines,and it has been highly concerned by academic circles in the past ten years.The complex systems from different disciplines are seemingly widely different,but some common features,such as power-law distribution of the nodes' s degree?small world and community structure,will gradually arise in the process of development of those systems.Complex network can acurately describe the key features of the real world systems due to it conquers the traditional reductionism's disadvantage of studying the objects separagely,and the feature of community structure is the start point of this dissertation.On the one hand,the community structure of complex networks can correspond to the naturally formed subsystems of real world complex systems,for example,protein cooperation network can form functional modules correspond to different organs,scientist collaboration network can form various research groups correspond to different disciplines,social network can form clubs correspond to different interests and hobbies.On the other hand,mainstream knowledge about community structure of complex networks still maintained intuitively,and in the academic circles,scholars just consider the community structure as a set of nodes which connected with each other with a higher probability.And meanwhile,there still lots of problems about the community detection algorithms,so community detection is still an open topic.Therefore,it is a very important theoretical significance and practical significance to study community detection algorithm.Like clustering algorithm,community detection also belongs to unsupervised learning realm,it aims at assigning nodes in a network into different communities,so it is a promising idea to propose community detection algorithm based on clustering approach frame.Density-based clustering algorithms have natural advantages over traditional clustering algorithms.For example,compared to K-means,DBSCAN don't need the number of clusters as prior information,subject to local opitmal,can recognize the noises,and so on.But the clustering results of DBSCAN are highly depend on the specific value of two parameters,therefore the parameter choosen problem is always a hard problem need to be solved in machine learning.In 2014,a novel density-based clustering algorithm(FDP :Fast Density Peak)was proposed in a paper published in Science,it not only reserves the advantages of DBSCAN,but also delicately conquers the parameter sensity issue,FDP offers a new studying approach for unsupervised learning.Inspired by the idea of the approach,we want to use FDP to detect community structure of complex networks in this dissertation.However,based on the theoretical analysis and numerical experiment,we found that FDP cannot recognize the community center nodes,because network data is embedded in an extremely high dimension.Furthermore,we also found that it is a hard task to recognize the number of communities manually through the decision graph approach when the network structure getting fuzzier or the number of communities is large.Based on the above two facts,we proposed ISOFDP algorithm which is based on manifold learning framework.ISOFDP conquers the dimension curse problem of network data such that the community center nodes can be distinguished from other member nodes significantly.Besides,we proposed an improved partition density function which is the criteria to determine the number of communities automatically,finally ISOFDP can achieve better results than the state-of-the-art algorithms.Furthermore,we generalize the framework of ISOFDP such that we proposed series of community detection algorithms which can detect overlapping commuinties,communities of weighted and directed networks and communities of dynamic networks respectively,and we also achieves better result than the state-of-the-art algorithms.This dissertation includes nine chapters.Chapter 1 introduces the background of the subject and the significance of the research,as well as the frame of the dissertation and the innovation points.Chapter 2 introduces the basic theory and relevant concepts of the complex network,then reviews research literature about various community detection algorithms.Chapter 3 analyze the shortcomings of density-based clustering algorithm FDP in detecting the community structure,then research the improved strategy such that we proposed the community detection algrithm ISOFDP.Chapter 4 proposed fast community detection algorithm which is based on local manifold learning algorithm,including L-ISOMAP,LLE,LE and LPP,they conquer the disadvantage of high time complexity of ISOFDP.Chapter 5 defined membership measures of member nodes,and distinguish the overlapping nodes according to the relative value of membership measures and threshold value ?,thus we proposed the overlapping community detection algorithm ISOFDPOV.Chapter 6 compute the pair-wised similarity of nodes based on signal method,this similarity effectively utilizes the weighted and direction information of edges in the network structure,thus we proposed the ISOFDPDW algorithms which can detect the communities of weighted and directed networks.Chapter 7 utilizes the information of time dimension such that we can model the graph series of all time points based on 3-order tensor,we define this tensor as adjacency tensor,then we decompose adjacency tensor into the node-community membership matrix and community-dynamic behaviour matrix,thus we can get the dynamic community tensor in which we can detect the communities of each time points and the evolution rules based on each longitudinal section of the tensor,finally we proposed Ten FDP algorithm which can detect communities of dynamic networks.Chapter 8 utilizes the daily data of CITIC industry index closing price from June 16,2006 to July 9,2014 to compute the correlation to form the edges of network of Planar Maximally Filtered Graph(PMFG)of CITIC industry index,then use ISOFDPDW to detect communities of this network before,during and after the financial crisis,such that the impact of financial crisis on the community structures of this network can be revealed.Chapter 9 summarizes the whole dissertation and present the shortcomings of the present work,then make the future research work prospects.The main innovation points of this dissertation including the following aspects:Firstly,we proposed a novel community detection algorithm named ISOFDP which is originated from density-based clustering algorithm.ISOFDP doesn't need the number of communities as the prior information,and its performance is better than the state-of-the-art algorithms in three aspects of accuracy,sensitivity of parameters and robustness.ISOFDP preserves the advantages of FDP but also conquers its disadvantages.We found that the density-based clustering algorithm FDP cannot be directly applied to detect community structures of complex networks by theoretical analysis and numeric experiment,because the network data possess the character of high dimensional data,so the pair-wised distances of nodes are highly homogeneous and FDP cannot distinguish the community center nodes correctly.We assume that there exists a low intrinsic dimension which preserves the community structure of the original network,so we proposed to use the state-of-the-art manifold learning algorithm ISOMAP to infer the intrinsic dimensional mapping of the network structure,in order to conquer the homogeneous distance problem.Besides,FDP need humane intervention to select the community center nodes,and we found that it is a hard task to determine the number of communities manually in its decision graph,so we proposed an improved partition density function to determine the number of communities automatically.Secondly,we proposed a new overlapping community detection algorithm named ISOFDPOV.After we obtained a hard-partition of the network by ISOFDP,then we can found the community center nodes and the member nodes of each community.We define the membership measure of those member nodes,and then determine the membership of each member node according to the relative value of membership measure and threshold value?,then the overlapping nodes can be distinguished if member node's membership is not unique,finally we can find the overlapping community structure.Thirdly,we proposed a novel algorithm ISOFDPDW which can detect the communities of weighted and directed networks.We found that the structure similarity can only utilize the binary symmetric matrix to compute the pair-wised similarity of nodes.Thus we propose to use the signal similarity which can effectively utilize the weighted and directed information of weighted and directed networks,then we can compute the pair-wised similarity more objectively,furthermore,we use weighted and directed modular function as converge object.Finally,we test the effectiveness of ISOFDPDW on weighted undirected,unweighted directed and weighted directed networks respectively.Fourth,we proposed a novel community detection algorithm named TenFDP which is based on nonnegtive tensor decomposition model to detect community structure of dynamic complex network.TenFDP assume the community structures of each time slice is not only related with present network structure but also related with community structures of previous time slice.So we utilize 3-order tensor model to depict the evolement of network structure to get the adjacency tensor.Then we decompose the adjacency tensor into the node-community membership matrix and community-dynamic behaviour matrix,thus we can get the dynamic community tensor in which we can detect the communities of each time points and the evolution rules based on each longitudinal section of the tensor.We test the effectiveness of TenFDP on 5 different types of dynamic networks.
Keywords/Search Tags:Complex network, Community detection algorithm, Density-based clustering, Manifold learning, Financial network
PDF Full Text Request
Related items