Font Size: a A A

Research On Dynamic Programming Based Large-scale Network Online Clustering

Posted on:2019-07-04Degree:MasterType:Thesis
Country:ChinaCandidate:X ZhaoFull Text:PDF
GTID:2370330566480053Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Many complex systems can be abstracted into networks or graphs for research.At the same time,the development of complex network science has also greatly promoted understanding of complex systems for people.In the process of research on complex networks,many important properties have been discovered,such as self-similarity,attractors,small-world,scale-free,and community structures.Among these properties,the community structure is one of the most important property in a complex network.It plays a decisive role in the research of the structure and function of complex networks.It also has very important significance in many fields such as sociology,physics,biology,computer science,etc.Graph clustering is able to identify community structures or clusters of closely connected nodes in a complex network,such that there is a higher density of links within clusters than between them.As people go on deeply researching the community structure,a large number of excellent graph clustering algorithms have been proposed and developed.Nowadays,with the rapid development of social networks and big data,the size of the network is growing as well,and many networks even contain millions of nodes and billions of edges.How to achieve large-scale network clustering has become a research bottleneck in this field.Based on this,this thesis first briefly introduces the research background and significance of graph clustering,and points out the important progress in this research area.Then,the thesis briefly introduces related basic knowledge including the definition and characteristics of complex networks,the definition of community structure,and the evaluation index of graph clustering.Next,the thesis has conducted in-depth research on large-scale network clustering.First of all,from the optimization of general graph clustering algorithm,a clustering algorithm by fast find central nodes in networks is proposed in the thesis,which is inspired by the recently proposed clustering algorithm via fast find and search density peak and the scale-free property.The algorithm achieves fast clustering by quickly searching and detecting clustering centers in a graph.On the other hand,in order to improve the general clustering algorithm scalability to adapt to large-scale networks and meet real-time requirements at the same time,the thesis proposes a dynamic programming framework for large-scale online clustering on graphs(DPOCG).The framework greatly reduces the time-varying network size by constructing super-nodes and removing remote nodes,so that the general graph clustering algorithm can be applied.The main research works of this thesis are shown as follows:(1)This thesis proposed a clustering algorithm by fast find central nodes in networks.The algorithm first calculates the local density ? and the integrated-distance ? of each node,and then the local density threshold thrho and the integrated-distance threshold thbeta of detecting cluster centers are quickly determined by the decision map drawn by ? and ?,and cluster centers are detected.Finally,other nodes are assigned to clusters that the closest cluster center is located to.Our experimental results on real-world networks show that the clustering effect of the density-based graph clustering algorithm is significantly better than the comparison algorithms.(2)This thesis proposed a dynamic programming framework for large-scale online clustering on graphs.The framework is based on the idea of dynamic programming and combines the characteristics of network evolution.It divides large-scale networks into sub-networks in several stages according to a certain period of time.Then,the size of the sub-networks in each phase is reduced in two ways.One is that constructing super-nodes for nodes in the same cluster which status has not been changed in the adjacent phases,and removing remote nodes.Finally,the general graph clustering algorithm ? is used to cluster on the reduced-scale network,and the removed nodes are quickly clustered through the nearest neighbor strategy.The fast and efficient DPOCG framework solves large-scale network clustering well and improves the scalability of general clustering algorithms while meeting real-time requirements.Experiments results on synthetic BA networks and real-world networks demonstrate that the DPOCG framework is effective and has obvious advantages over other comparative large-scale network clustering algorithms.
Keywords/Search Tags:complex network, community structure, scalability, online clustering, large-scale network clustering
PDF Full Text Request
Related items