
Research On Heterogeneous Graph Neural Network Algorithm Based On Self-supervised Representation Learning

Posted on: 2024-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhao
GTID: 2530307064985519
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of science and technology, the era of big data has arrived. Data of all kinds are growing explosively, and much of this data can be modeled as graphs. Unlike homogeneous graphs, heterogeneous graphs have multiple types of nodes and edges and therefore carry rich semantic information, which lets them model real-world scenarios intuitively and efficiently. Heterogeneous graphs are now widely used in fields such as citation networks, biomedicine, and recommendation systems, and analyzing them has become an important direction in data mining. Heterogeneous graph neural networks, a relatively new research field that applies deep learning methods to heterogeneous graphs, have become one of the most active topics in machine learning and data mining research.

Existing heterogeneous graph neural network approaches use structures defined on heterogeneous graphs, such as metapaths and the network schema, to capture structural and semantic information. They then apply the message passing mechanism to these structures to generate node vector representations for different downstream tasks. However, these methods focus mainly on structural and semantic information and pay less attention to feature representation. With the wide application of representation learning in various fields, some methods have tried to learn representations of node features and have achieved some results. Many of these methods adopt self-supervised contrastive learning, which reduces the dependence on label information by constructing negative sample sets. They also learn functions that map the input feature space to low-dimensional feature spaces while preserving structural and semantic information.

This paper proposes a self-supervised representation learning model for heterogeneous graphs that does not rely on negative sample sets. The model uses an improved Autoencoder to pre-train node features, where both the encoder and the decoder are based on the Transformer architecture. The model extracts a low-dimensional vector representation of the target node by encoding the target node under different types of metapaths. Experiments on several benchmark heterogeneous graph datasets show that the proposed self-supervised heterogeneous graph representation learning model can effectively enhance the performance of various classical and heterogeneous graph neural networks.

Furthermore, this paper proposes an improved graph attention network model that uses the attention mechanism to fuse the feature embeddings generated under the various metapaths by the proposed self-supervised representation learning model. The model alleviates the over-smoothing problem through a residual mechanism and an L2 normalization strategy. Together, the proposed self-supervised representation learning model and the improved graph attention network form a heterogeneous graph neural network algorithm model based on self-supervised representation learning. The model achieves excellent results in both supervised and semi-supervised scenarios.

The main contributions of this paper are summarized as follows:
(1) This paper proposes a self-supervised representation learning model for heterogeneous graphs, and experiments verify the effectiveness of the model.
(2) This paper proposes an improved graph attention network which, combined with the self-supervised representation learning model, constitutes the heterogeneous graph neural network algorithm model based on self-supervised representation learning. This model achieves excellent performance on multiple benchmark heterogeneous graph datasets.
(3) This paper uses a unified data division standard to conduct large-scale experiments on existing related algorithms, giving a fair and unified comparison of existing algorithms on three benchmark datasets. In addition, this paper conducts applied research on the Protein-Protein Interaction (PPI) dataset to verify the performance of the proposed model on different types of data.
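The fusion step described in the abstract (attention over per-metapath embeddings, followed by a residual connection and L2 normalization) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not give the scoring function, so the dot-product score, the `query` vector, the function names, and the metapath labels `PAP` and `PSP` are all assumptions chosen for illustration, and the sketch assumes all vectors share one dimension.

```python
import math
from typing import Dict, List

Vector = List[float]


def l2_normalize(v: Vector, eps: float = 1e-12) -> Vector:
    """Scale v to unit Euclidean length; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_metapath_embeddings(
    node_features: Vector,
    metapath_embeddings: Dict[str, Vector],
    query: Vector,
) -> Vector:
    """Fuse one node's per-metapath embeddings into a single vector.

    Assumption (not from the thesis): the attention score of a metapath
    is the dot product between a query vector and that metapath's embedding.
    """
    names = sorted(metapath_embeddings)
    scores = [
        sum(q * e for q, e in zip(query, metapath_embeddings[name]))
        for name in names
    ]
    weights = softmax(scores)

    # Attention-weighted sum of the metapath embeddings.
    fused = [0.0] * len(node_features)
    for weight, name in zip(weights, names):
        for i, value in enumerate(metapath_embeddings[name]):
            fused[i] += weight * value

    # Residual connection: add the node's own features back in.
    fused = [f + x for f, x in zip(fused, node_features)]

    # L2 normalization keeps the output on the unit sphere.
    return l2_normalize(fused)


if __name__ == "__main__":
    # Hypothetical embeddings of one node under two metapaths.
    embeddings = {"PAP": [0.2, 0.9], "PSP": [0.7, 0.1]}
    print(fuse_metapath_embeddings([1.0, 0.0], embeddings, query=[0.5, 0.5]))
```

In this sketch the residual term keeps each node's own features in the output, and the normalization fixes the output length at 1, which are the two mechanisms the abstract credits with alleviating over-smoothing.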
Keywords/Search Tags: Self-supervised Learning, Representation Learning, Heterogeneous Graph Neural Network, Self-attention Mechanism, Feature Pre-training