Font Size: a A A

Research On Stochastic Block Model For Complex Networks Analysis

Posted on:2022-03-28Degree:DoctorType:Dissertation
Country:ChinaCandidate:X Y LiuFull Text:PDF
GTID:1480306728482394Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Complex networks,abstracted from real-world complex systems,have a wide range of research values and application prospects.For example,structural pattern detection is essential for discovering the implied functional modules in complex systems,understanding the potential connections among them,and predicting the dynamic evolution of complex systems.The stochastic block model(SBM)has many advantages in studying complex networks,such as interpretability,expressiveness,generalization and flexibility.SBM has been used in different network analysis tasks,such as community discovery,link prediction,network representation learning,etc.However,real-world networks are sparse,noisy,large-scale,and complex,there are still many problems in the study of complex networks when using SBM.(1)Signed networks are one of the common types of complex networks.However,most existing SBM learning methods for signed networks are unsupervised,leading to poor performance in finding hidden structural patterns,especially when handling noisy and sparse networks.Therefore,can we develop a more flexible model for semisupervised learning that can detect both community and multipartite structures from signed networks?(2)In practical applications,the optimal SBM model needs to be found to fit complex networks in the real world as accurately as possible.However,learning an optimal SBM for a given network is still an NP-hard problem.Therefore,can we propose a fast-learning algorithm for SBM,which has both model selection capability and better performance in network analysis tasks?(3)Attributed network representation learning is one of the hot spots in network science research.Most of the existing methods mainly focus on the local information but ignore the global knowledge,which leads to poor performance on the disassortative networks.Therefore,can we apply the expressiveness of SBM to the attribute network representation learning to handle both assortative and disassortative networks?Based on the above questions,the research work in this paper focuses on three aspects of SBM: model extension,fast learning,and model application.The main contributions of this thesis are as follows.(1)A novel semi-supervised signed SBM and a learning algorithm based on variational Bayesian inference are proposed to discover both the community and multipartite structures in signed networks.Specifically,a new parameter is introduced to describe the relationship between blocks and labels to which nodes belong.The model improves the performance of the model in discovering the hidden blocks in the network using partially known labels.The model uses two vectors to represent the probability of intra-block and inter-block nodes having different types of links,respectively,thus characterizing the kinds of structure in the network.Based on variational Bayesian theory,an efficient semi-supervised learning algorithm is proposed for estimating the parameters and hidden variables of the model.The proposed model is validated through many experiments wherein it is compared with the state-of-the-art methods using synthetic and real-world data.The carefully designed tests,allowing us to account for different scenarios,show our method outperforms other approaches existing in this space.It is especially relevant in the case of noisy and sparse networks as they constitute the majority of the real-world networks(2)The redefined SBM based on Poisson distribution and the block-wise algorithm based on Expectation-maximization(EM)are proposed to deal with largescale networks effectively.Compared with the classical SBM model,this model assumes that the generation of links in the network follows Poisson distribution,which makes the time complexity of the learning algorithm of this model related to the number of edges of the network rather than the square of the number of nodes.Second,the model redefines the process of edge generation: it depends on the link probability blocks and nodes rather than that between blocks.This method removes the components dependency of the hidden variables' posterior distribution;thus,the posterior of the hidden variables can be calculated directly by EM,rather than variational methods or sampling methods for estimating an approximate one.In each iteration of parameter estimation,the “quality” of each block is evaluated,and the unsuitable blocks are deleted.In this way,parameters are estimated,and the proper model is selected at the same time.Extensive validation conducted on both artificial and real-world data shows that our proposed method significantly outperforms the state-of-the-art methods in terms of a reasonable trade-off between accuracy and scalability.(3)A generative model for attributed network representation learning is proposed for assortative and disassortative networks.Specifically,nodes in the network are assigned to blocks in which nodes within the same block have similar linkage patterns.These blocks can define assortative networks with communities or disassortative networks with other structures.To preserve attribute information,the model assumes that each node has a hidden representation associated with the block,while the relationship between the representation of the node and the attribute is nonlinear,which can be characterized by a neural network.Extensive experiments are performed on realworld and synthetic attributed networks.The results show that the proposed method consistently outperforms state-of-the-art embedding methods for clustering and classification tasks,especially on disassortative networks.
Keywords/Search Tags:Complex Networks, Stochastic Block Model, Semi-supervised Learning, Scalable Learning, Network Representation Learning
PDF Full Text Request
Related items