| In the real world,the interaction between entities can be abstracted as a complex network through nodes and edges.Community is one of the important topologies in complex networks.The nodes in the same community are closely connected or have high similarity,while the nodes in different communities are sparsely connected or have low similarity.Traditional community detection algorithm is based on the idea of partition.Every node in complex network is assigned to a community,and there is no intersection between communities.However,in the real complex network,some nodes may present the characteristics of outliers,it is difficult to divide them into a suitable community,and some nodes may belong to multiple different communities at the same time.To solve the above problems,in the process of community detection,we allow data nodes with outliers not to be assigned to any community,in order to mine communities with higher accuracy;meanwhile,allow some data objects to be assigned to multiple communities,so that a community can recall as many nodes as possible.This thesis proposes a R-Means algorithm to solve the community detection problem from the perspective of mutual information,the main works are as follows:(1)Kullback-Leibler Divergence Ball(KL-Ball)is used to define a community,the distance between a node and its centroid is computed by using KL divergence.The community is described from four aspects: centroid,radius,mutual information and density.The centroid determines the position of the community in the network,radius describes the coverage of the community,mutual information measures the consistency of the nodes in the community,and density reflects the number of nodes in the community.(2)On the basis of KL-Ball,an R-Means objective function for community detection is proposed.Given a radius,the objective function expects to find communities with low information and high density from complex networks.Low information makes the nodes within the community have strong consistency,and high density makes a community contain as many node objects as possible,so as to ensure the quality of the mined community.(3)This thesis proposes an R-Means algorithm for community detection,which uses the sequential “draw-and-merge” strategy to optimize the objective function.By learning the centroid of the community and combining with the community radius R,the algorithm can find both high-precision non-overlapping communities in complex networks and overlapping communities in complex networks.(4)The convergence of R-Means algorithm is proved theoretically.The algorithm can converge to a local optimal solution of R-Means objective function in limited steps.The experimental results show that the R-Means algorithm for community detection can not only effectively discover the community partition pattern of nodes in complex networks,but also discover high-precision communities,and can also mine overlapping communities in networks. |