| With the rapid development of networks and the emergence of social platforms, social networks have grown larger and more complex. Social networks have given users a new way to communicate and provide an unprecedented amount of data for data mining and, in turn, for the analysis of user behavior. Classification is an active research area in data mining. Because of the diversity of data structures and the complexity of nodes in social networks, node classification in social networks has become a research hotspot in this field. In social networks, the category of a node is affected by both the network structure and the attributes of the node itself. Most researchers use a single structural factor or a single piece of attribute information as the criterion for node classification. In real networks, however, many factors determine the category of a node, and these factors interact with each other. This dissertation therefore proposes a method for classifying social network nodes based on multiple influencing factors. The main research contents are as follows: (1) The influence of user nodes, neighborhood structure, interest preference and activity index on the category of nodes in social networks is considered. Working at the two levels of network structure and node attributes, a label propagation algorithm, IALPA, which integrates topological structure and node attributes, is proposed to classify nodes. First, the GIN strategy, which combines node influence and neighborhood structure, is proposed. The GIN strategy generates the node update sequence by ranking nodes by influence, and guides the initial label assignment with node neighborhood similarity. Then, the GLP strategy, which combines neighborhood structure and attribute information, is proposed to determine the guiding priorities of the remaining three factors and to measure their similarity coefficients to guide label propagation. At the same time, the ILPA and ALPA methods are derived from the
number and category of attributes to optimize the corresponding classification process. Real social network data sets are used to verify the attribute similarity differences and the modularity of the classification results. The results show that attributes are similar within categories but differ between categories, and that the resulting categories exhibit strong community structure. (2) During label propagation, an excess of labels from unimportant nodes may mislead the label selection of other nodes, resulting in inaccurate classification. To solve this problem, this dissertation proposes LRWLPA, a label propagation method based on local seed sets. First, the LIN strategy for pre-selecting seed nodes is proposed: a seed set of important nodes is generated by a random-walk ranking method combined with a similarity index, and only the seed nodes are assigned labels. Then the LLP label propagation strategy is proposed, which adjusts the label propagation rules according to the seed set and the label update sequence so that the node classification conforms to the actual classification standards. The method is tested on real social network data sets to verify the accuracy, stability and community structure of the classification results. The results show that classification accuracy is improved, the modularity of the results is high, and the overall results are more stable than those of the label propagation algorithm. |
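
The general idea of seed-based label propagation described in (2) can be illustrated with a minimal Python sketch. This is not the LRWLPA, LIN or LLP procedure of the dissertation: the abstract does not give their details, so node degree is used here purely as a placeholder for the random-walk ranking and similarity index, and the function names `top_degree_seeds` and `propagate_from_seeds` are hypothetical.

```python
from collections import Counter


def top_degree_seeds(adj, k):
    """Pick the k highest-degree nodes as seeds; each seed's label is its own id.

    Assumption: degree is only a placeholder for an importance ranking.
    """
    ranked = sorted(adj, key=lambda v: (-len(adj[v]), v))
    return {v: v for v in ranked[:k]}


def propagate_from_seeds(adj, seeds, max_iters=100):
    """Spread labels from fixed seed nodes to the rest of an undirected graph.

    adj:   dict mapping node -> list of neighbor nodes
    seeds: dict mapping seed node -> label (labels must be mutually comparable)
    """
    # Only seed nodes start with a label; all other nodes are unlabeled (None).
    labels = {v: seeds.get(v) for v in adj}
    # Assumed update sequence: highest degree first, node id breaks ties.
    order = sorted(adj, key=lambda v: (-len(adj[v]), v))
    for _ in range(max_iters):
        changed = False
        for v in order:
            if v in seeds:
                continue  # seed labels stay fixed
            counts = Counter(labels[u] for u in adj[v] if labels[u] is not None)
            if not counts:
                continue  # no labeled neighbor yet
            best = max(counts.values())
            top = sorted(lab for lab, c in counts.items() if c == best)
            # Keep the current label if it is among the most frequent ones;
            # otherwise take the smallest such label (deterministic tie-break).
            new_label = labels[v] if labels[v] in top else top[0]
            if new_label != labels[v]:
                labels[v] = new_label
                changed = True
        if not changed:
            break
    return labels


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
seeds = top_degree_seeds(adj, 2)
labels = propagate_from_seeds(adj, seeds)
```

On this toy graph the two bridge nodes are chosen as seeds and each triangle adopts the label of its own seed. Restricting the initial labels to a few important nodes is what keeps labels from low-importance nodes out of the vote.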