| Most of the networks are huge in reality,which makes the scientific research expensive and inefficient on them.Network graph sampling can sample a small subgraph from the original graph to study,which can save resources and improve computing efficiency,and has great significance to data mining of network.Sampling technology over the dynamic graph stream in recent years,which adopts the idea of reservoir and processes each edge only once,has reduced space and time cost greatly.However,there is a common weakness in the existing dynamic graph sampling algorithms,which is that the proportion of nodes with lower degree of subgraph is too large,while the proportion of nodes with higher degree is insufficient,resulting in the under-representation of the subgraph obtained by sampling and influencing further research.Based on the power law distribution of network graph,the problems of existing sampling algorithms,a classification sampling algorithm is proposed in this paper.It uses the idea of classification.The nodes in the graph stream are divided into different categories,and different replacement policies are set for different categories.The selected node must meet the corresponding probability threshold before being replaced in order to improving the proportion of higher nodes in dynamic graph,reducing the distribution error with the realistic graph,improving the representation of the subgraph.In order to measure the performance of the proposed model objectively,experiments are designed on three real networks.The results show that compared with the existing three classical dynamic network sampling algorithms,our algorithm has achieved great improvement in clustering coefficient characteristics,and is closer to the original graph in degree distribution and K-core distribution characteristics.The KS distance is smaller and the effect is more stable with different sampling ratios. |