| Complex networks are distinguished from conventional networks by their structural complexity and the uncertainty of node interactions. Complex networks are studied and analyzed across many academic fields, and community discovery using statistical methods is a hot topic in complex network research. Early scholars studied static networks that do not vary over time. With advances in science and technology and the introduction of time-aware methods, research gradually turned to dynamic networks, which have more scientific research value. The continuous extension of dynamic concepts has enriched this research field. Among the many available methods, statistical models combine well with networks to yield meaningful conclusions, so they are favored by researchers and attract more scholars to the field. The network structure in community discovery is mainly composed of nodes and edges, including node attributes and edge link information. The modeling can be further refined according to the community membership of nodes and the link pattern of edges. After the data sets are processed to meet the model requirements, the constructed model yields community discovery results through node clustering, which can then be analyzed in light of the actual situation. The model in this thesis is based on the Mixed Membership Stochastic Block Model (MMSB); it incorporates the idea of "instantaneous linking" from recurrent events and a panel count model under discrete observation to analyze dynamic complex networks with time attributes. This is the innovation of the thesis in terms of the model. Under this model, the recurrent link data generated by node interactions in the network are brought into a semiparametric proportional model with covariate information. The intensity function takes a product form, and the network is a directed network
with mixed membership of nodes: a node does not have a single community attribute but a composite community membership. This setting is more consistent with reality than assigning each node to a single community, but it requires more data and more computing power. The model combines a recurrent event model with a network model to form a new dynamic network model under discrete observation. Some of the parameters follow the estimation method of the original model, while the remaining parameters are coupled with one another, so a multiply nested iterative procedure with convergence checks is needed to obtain the parameter estimates. The model contains unobservable latent variables. If traditional maximum likelihood estimation were used to compute the marginal probability, the resulting derivatives would be too complex to solve. Therefore, the recurrent event parameters in the conditional likelihood function for establishing links are estimated using the expectation-maximization (EM) method and estimating equations, and the network parameters are estimated using the variational expectation-maximization (VEM) method: the latent variables are replaced by free variational parameters, and the update formulas for the corresponding parameters are derived from the likelihood function using mean-field theory and the constraint conditions. How to divide communities more accurately has always been a key issue in network community discovery. One reasonable approach is to add node attributes to the model to capture more node information. For example, students' gender may lead to behavioral differences, and position level within a department affects the number and frequency of email exchanges. However, because such information involves personal privacy, this thesis instead takes the sentiment attribute of the hyperlink text as the covariate in the model, and the model yields a conventional
positive and negative sentiment conclusion. To test the clustering performance of the model, the thesis first conducts a simulation study. The simulation is repeated 100 times under each combination of three sample sizes and four intensity functions to obtain the network community division and the corresponding parameter estimates. The Adjusted Rand Index (ARI) is used to evaluate the community clustering performance of the proposed model and method, and box plots display the results intuitively. As the sample size increases, the estimates become more accurate. In the case study, the data set is the Reddit hyperlink network. First, the K-means algorithm is used to obtain initial values for the model. For selecting the number of communities, the Bayesian Information Criterion (BIC), which is more suitable for small networks, is used, and the link data are counted by time segment. After the algorithm runs, the hyperlink text is divided into four types of communities, and the mixed membership matrix of the nodes is obtained. The corresponding parameters and the cumulative hazard function are estimated and interpreted. Finally, the thesis summarizes the proposed model and method, reflects on the problems encountered during the research, and proposes ideas for improving the work in the future. |
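
The ARI-based evaluation mentioned above can be illustrated with a minimal sketch. It assumes that each node's mixed-membership vector is reduced to a hard label by taking its largest component; the abstract does not state how memberships are hardened, so this rule is an assumption. The function names `hard_labels` and `adjusted_rand_index` and the toy matrix are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from math import comb


def hard_labels(membership):
    """Assign each node to its highest-weight community (assumed rule)."""
    return [max(range(len(row)), key=row.__getitem__) for row in membership]


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two partitions of the same nodes."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no node pairs to compare
    # Contingency counts: nodes sharing each (true, predicted) label pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in pair_counts.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)


# Toy example: six nodes, three communities (values are illustrative only).
membership = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
]
truth = [0, 0, 1, 1, 2, 2]
score = adjusted_rand_index(truth, hard_labels(membership))
```

Because the ARI is invariant to relabeling of communities, the estimated community indices do not need to be matched to the true ones before scoring. In a simulation study like the one described, one score per replication could be collected and summarized in a box plot.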