
Research On Clustering Algorithm Based On Graph Representation Learning In Information Network Environment

Posted on: 2023-09-19  Degree: Doctor  Type: Dissertation
Country: China  Candidate: N Li  Full Text: PDF
GTID: 1520307097473954  Subject: Computer Science and Technology
Abstract/Summary:
With the progress of society, people have entered a new information age in which diverse information network data, such as Weibo, Twitter, Douban, and DBLP, are available for scientific research. In the face of these large-scale, high-dimensional, multi-model and multi-view complex information network data, clustering based on graph representation learning becomes more complex. Although recent years have brought breakthroughs in the accuracy of graph representation and in clustering performance, it remains challenging to establish an effective and scalable graph representation and to realize clustering algorithms with high performance, fast convergence and strong robustness. To address these problems for complex information networks, especially social networks, this dissertation constructs more accurate graph representation models of information networks and, on that basis, realizes clustering algorithms based on graph representation learning in the information network environment. The main contributions and innovations of this dissertation are summarized as follows:

1. Traditional graph construction methods cannot effectively capture the internal topology of complex social networks. To address this, the dissertation realizes a community discovery clustering algorithm based on the similarity graph of complex social networks. With the increasing popularity of social network applications, social networks are becoming more complex in structure and larger in size. The community discovery algorithm is improved in three respects: social network scale, similarity calculation and cluster center measurement. First, a Multiple Similarity calculation Method (MSM) is proposed, which captures both the local topological structure and the vertex features of large-scale social networks based on six types of similarity features in social networks, and then qualitatively determines the similarity evaluation method for complex networks. Then, a clustering optimization algorithm combining MSM and the K-means clustering algorithm is proposed. Finally, experiments on the Pblog and DBLP data show that the proposed algorithm performs significantly better in terms of Modularity, Sum of Squared Error and Density.

2. Current community discovery clustering algorithms for complex social networks assume that all entities are of the same type. To address this, the dissertation realizes a community discovery clustering algorithm based on non-negative matrix decomposition for multi-model social networks. To study multi-model social networks more accurately, it proposes a graph Multi-similarity Regular Tri-factorization clustering Algorithm (MRTA). First, three key characteristics of clustering indicators in multi-model social networks are investigated. Then, a generalized clustering model for multi-model networks is proposed, which introduces same-mode and different-mode clustering similarity relations simultaneously. Because inter-model noise affects clustering, l1 regularization is adopted to reduce the influence of inconsistent correlations between clusterings and to improve the robustness of MRTA. Finally, correctness, robustness and convergence are analyzed experimentally, and correctness, convergence and complexity are analyzed theoretically, on simulated data and various social network data. The experimental and theoretical results confirm the clustering effect, robustness and convergence of MRTA.

3. High-dimensional sparse vector representations of semantic information networks are the main problem affecting the performance of spectral clustering. The dissertation studies a spectral clustering algorithm based on a similarity graph and a nonlinear graph embedding representation, improving spectral clustering in two respects: similarity graph construction and low-dimensional vector feature representation. First, a large-scale network similarity graph S based on landmark nodes is proposed. Then, a nonlinear objective model for spectral clustering is proposed. In addition, a Sparse Encoder Graph Spectral Clustering algorithm (SEGSC) is proposed. Theoretical and experimental analysis verifies the effectiveness and scalability of the algorithm, and experiments show that its clustering accuracy and mutual information entropy are higher than those of the comparison algorithms.

4. The high dimensionality, lack of labels and redundancy of multi-view semantic information networks degrade clustering performance. To address this, the dissertation studies an unsupervised feature selection clustering algorithm based on multi-view networks. Networks represented in multi-view heterogeneous feature spaces are now ubiquitous. Multi-view networks can not only synthesize these view spaces but also accurately represent the semantic information. The dissertation proposes a Multi-view Feature Selection-based Clustering algorithm (MFSC), which incorporates similarity graph learning and unsupervised feature selection for multi-view networks. First, local manifold regularization is integrated into the similarity graph learning. Meanwhile, the clustering labels of this similarity graph are used as the standard for unsupervised feature selection. This model can select features with respect to the clustering labels while maintaining the manifold structure of the multi-view network. Then, K-means clustering is performed on the selected features, which yields the MFSC algorithm framework. The complexity and convergence of MFSC are analyzed theoretically. Finally, the algorithm is systematically evaluated on benchmark multi-view networks. The experimental results show that MFSC achieves better clustering performance than the traditional algorithms.
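As a small illustration of the community discovery setting in contribution 1, the sketch below computes one neighborhood-based vertex similarity and the Modularity score of a given partition. It is not the dissertation's MSM or clustering algorithm: the abstract does not specify the six similarity types, so Jaccard similarity of neighbor sets is used here purely as an assumed example, and the function names, the toy graph and the labels are all hypothetical.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def jaccard_similarity(adj, u, v):
    """Jaccard similarity of the neighbor sets of u and v (an assumed example measure)."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def modularity(adj, labels):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / (2m))^2 ] for an undirected graph.

    L_c is the number of edges inside community c, d_c the total degree of its
    nodes, and m the number of edges in the graph.
    """
    two_m = sum(len(nbrs) for nbrs in adj.values())  # sum of degrees = 2m
    if two_m == 0:
        return 0.0
    intra = defaultdict(int)   # intra-community edge endpoints (each edge counted twice)
    degree = defaultdict(int)  # total degree per community
    for u, nbrs in adj.items():
        degree[labels[u]] += len(nbrs)
        for v in nbrs:
            if labels[v] == labels[u]:
                intra[labels[u]] += 1
    return sum(intra[c] / two_m - (degree[c] / two_m) ** 2 for c in degree)


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = build_adjacency(edges)
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

print(jaccard_similarity(adj, 0, 1))  # nodes in the same triangle
print(jaccard_similarity(adj, 0, 5))  # nodes in different triangles
print(modularity(adj, labels))
```

On this toy graph the two-triangle partition scores Q = 5/14 (about 0.357), while placing every node in a single community scores 0, which is why a higher Modularity is read as a better community structure.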
Keywords/Search Tags:information network, social network, graph representation, clustering algorithm, multi-model social network, multi-view semantic information network