| A network is a set of nodes that may interact with one another; it is a formal representation of a complex system of interacting components. Networks are widely used in social science, bioinformatics, physics, mathematics, and computer science. Mining the static topology of a network is one of the most interesting parts of network analysis, with special emphasis on recognizing network block structure. Because blocks contain different numbers of nodes and nodes connect at random, the connections between blocks are themselves random and vary in number. Because network structure is complex, the analysis of block structure has focused on simple structures such as community structure or hierarchical structure, in which many edges link nodes of the same block and comparatively few edges link nodes of different blocks; this limits the practical application of block structure analysis to real-world networks. Owing to careless data acquisition or to the nature of network data itself, networks commonly contain some abnormal nodes (outliers) that differ from the majority of nodes and do not follow a regular pattern. Although outliers account for only a small proportion of nodes, they can dramatically influence network properties and degrade block structure detection results, and the number of outliers in a network is generally unknown. In this paper, we propose a novel model, the robust stochastic block model (RSBM), and an effective parameter learning algorithm, RSBML, for detecting underlying block structures that vary in block size and in the random connections between blocks, regardless of interference from outliers. RSBM statistically models connection patterns and diverse network block structures using a block connection probability matrix, and treats outliers as part of the data by quantifying their randomness. The model thus reflects the disturbance that outliers cause during block structure detection, which preserves all kinds of block structure and ensures the robustness of RSBM. Using the Kullback-Leibler divergence, we rewrite the log-likelihood of the observed data to obtain an expression for its lower bound, and we use a variational Bayesian approach to approximate the true joint posterior distribution of the model parameters. The proposed estimation algorithm maximizes this lower bound to obtain the distributions of the model parameters and the outlier vector, thereby detecting the underlying block structure. Through extensive experiments on synthetic networks, in comparison with six other algorithms, we verify the structure detection ability, flexibility, robustness, and scalability of RSBM. We further verify the structure detection ability of RSBM and its practical value on real-world networks collected from different fields, with different structures and scales. |
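
To make the role of a block connection probability matrix concrete, the following is a minimal sketch of sampling a network from a generic stochastic block model with a few outlier nodes added. It is not the RSBM or RSBML described above: the function name `sample_sbm_with_outliers`, the parameter `outlier_prob`, and the rule that outliers connect to every node with one uniform probability are all assumptions made for illustration.

```python
import numpy as np

def sample_sbm_with_outliers(block_sizes, block_probs, n_outliers=0,
                             outlier_prob=0.5, seed=0):
    """Sample an undirected network from a generic stochastic block model.

    block_sizes  : number of regular nodes in each block
    block_probs  : K x K symmetric matrix; entry (k, l) is the probability
                   of an edge between a node in block k and one in block l
    n_outliers   : hypothetical extra nodes that ignore the block structure
    outlier_prob : assumed edge probability for any pair involving an outlier
    """
    rng = np.random.default_rng(seed)
    block_probs = np.asarray(block_probs, dtype=float)

    # Block label of every regular node; outliers are labelled -1.
    labels = np.concatenate([
        np.repeat(np.arange(len(block_sizes)), block_sizes),
        np.full(n_outliers, -1),
    ])
    is_outlier = labels < 0
    n = labels.size

    # Look up pairwise edge probabilities from the block matrix.
    # Outliers temporarily borrow block 0 and are overwritten below.
    safe = np.where(is_outlier, 0, labels)
    probs = block_probs[safe[:, None], safe[None, :]]
    probs[is_outlier, :] = outlier_prob
    probs[:, is_outlier] = outlier_prob

    # Draw each unordered pair once (strict upper triangle), then mirror it
    # so the adjacency matrix is symmetric with no self-loops.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adjacency = (upper | upper.T).astype(int)
    return adjacency, labels

# Two blocks of unequal size plus five outliers.
A, labels = sample_sbm_with_outliers(
    block_sizes=[30, 20],
    block_probs=[[0.80, 0.05],
                 [0.05, 0.60]],
    n_outliers=5,
)
```

Setting the off-diagonal entries of `block_probs` larger than the diagonal ones yields a block structure that is not a community structure, which is the kind of diversity a block connection probability matrix can express.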