
Study On Self-Supervised Clustering Algorithm For Fusion Graph Structure Information

Posted on: 2024-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Chen
Full Text: PDF
GTID: 2568307118987199
Subject: Information and Communication Engineering
Abstract/Summary:
Cluster analysis groups samples according to a chosen similarity metric. Traditional clustering methods have high computational complexity, give unstable results, and perform poorly on high-dimensional data with complex structure. In recent years, clustering algorithms based on deep neural networks have significantly improved clustering performance, because their powerful nonlinear mapping ability yields more meaningful feature representations. Most existing deep clustering methods learn data features with a deep neural network and cluster on them, but do not fully exploit the latent relationships between data points. A graph structure encodes these latent relationships, and researchers have used graph neural networks to learn structural representations from it for clustering. However, the few algorithms that combine the two simply concatenate data features and structural representations, so the two kinds of feature information do not deeply complement each other. To address these problems, this thesis first extracts data features with an autoencoder and learns structural representations by modeling the graph structure, then fuses the data features and structural representations for clustering, and finally constructs a self-supervised mechanism to optimize the overall network.

The main work and contributions of this thesis are as follows:

First, a self-supervised clustering method based on graph structure fusion is proposed. To address the problem that existing clustering algorithms combine data features and structural representations without making the two complement each other, data features are extracted by an autoencoder, structural representations are learned by a graph autoencoder built on graph convolutional networks, and an adaptive fusion network is constructed to fuse the two kinds of feature information layer by layer for clustering. To address the problem that data feature extraction, structural representation extraction, and the adaptive fusion network are relatively independent and lack a unified target distribution to guide training of the overall network, a joint multi-distribution self-supervision mechanism is constructed: the outputs of the autoencoder and the graph autoencoder are used to design a unified target distribution that jointly optimizes the extraction and fusion of data and structural features, so that the autoencoder, graph autoencoder, and adaptive fusion network are trained cooperatively. Experimental results on six public datasets show that the self-supervised clustering method based on graph structure fusion effectively improves clustering performance on different types of data.

Second, a multi-feature self-supervised clustering method based on graph contrastive learning is proposed. To address the over-smoothing that occurs when existing graph-based clustering algorithms use graph neural networks that are too deep to learn structural representations, contrastive learning is implemented on the graph structure, and a graph contrastive loss is designed to incorporate structural information into the data features extracted by a multilayer perceptron and an autoencoder, so that in the latent feature space each node's features move closer to those of its positive samples and farther from those of its negative samples. The algorithm therefore no longer relies on a graph neural network to learn structural representations. To address the reliance of graph models on the adjacency matrix to guide node feature aggregation, the adjacency matrix is used only before model training, to define the connection strength between nodes and thereby distinguish positive from negative sample nodes for contrastive learning. It is no longer used to guide the learning of structural representations during training, and the self-supervised mechanism iteratively optimizes the overall network. Experimental results on six public datasets show that the multi-feature self-supervised clustering method based on graph contrastive learning achieves better clustering performance with a simpler network structure and lower resource consumption, and that the algorithm is more robust.

This thesis has 20 charts, 12 tables and 105 references.
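
The abstract does not give the form of the graph contrastive loss, so the following is only an illustrative sketch of the general idea, not the thesis's actual formulation. It assumes an InfoNCE-style loss with cosine similarity, a temperature parameter, and a rule that any node pair with a nonzero adjacency entry is a positive pair and every other pair is negative. The function names `cosine` and `graph_contrastive_loss` are hypothetical. Note that the adjacency matrix is used only to pick positives and negatives, not to aggregate node features.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def graph_contrastive_loss(features, adjacency, temperature=0.5):
    """InfoNCE-style loss (an assumed form, not taken from the thesis).

    features:  list of n feature vectors, e.g. encoder outputs.
    adjacency: n x n matrix; adjacency[i][j] > 0 marks j as a positive
               sample for node i, anything else is treated as negative.
    """
    n = len(features)
    total, counted = 0.0, 0
    for i in range(n):
        # Exponentiated, temperature-scaled similarity to every node.
        sims = [
            math.exp(cosine(features[i], features[j]) / temperature)
            for j in range(n)
        ]
        pos = sum(sims[j] for j in range(n) if j != i and adjacency[i][j] > 0)
        denom = sum(sims[j] for j in range(n) if j != i)
        if pos == 0.0:
            continue  # node without neighbours: no positive pair to contrast
        total += -math.log(pos / denom)
        counted += 1
    return total / max(counted, 1)


# Toy graph: nodes 0-1 are connected, nodes 2-3 are connected.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
aligned = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
misaligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]

loss_aligned = graph_contrastive_loss(aligned, adjacency)
loss_misaligned = graph_contrastive_loss(misaligned, adjacency)
print(loss_aligned < loss_misaligned)
```

In this toy setup the loss is lower when connected nodes share similar features than when they do not, which is the behaviour the abstract describes: positive sample nodes are pulled together and negative sample nodes are pushed apart in the latent space.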
Keywords/Search Tags:clustering, graph structure, self-supervised, contrastive learning, autoencoder