Research On Overlapping Community Detection Based On GNN In Heterogeneous Networks

Posted on: 2023-01-17

Degree: Master

Type: Thesis

Country: China

Candidate: Y Sun

Full Text: PDF

GTID: 2530306845458274

Subject: Computer Science and Technology

Abstract/Summary:

Community discovery divides the nodes of a network into multiple communities according to some partitioning method. Early research abstracted real networks as homogeneous information networks, in which all nodes and all edges are of a single type, and proposed many traditional community discovery methods for them. However, most real-world networks are heterogeneous information networks, in which nodes and edges are of multiple types. Community discovery methods designed for homogeneous information networks can be applied to heterogeneous information networks, but the resulting community division has low accuracy. Moreover, most communities in the real world overlap, so overlapping community discovery in heterogeneous information networks can both exploit rich semantic information and produce more realistic results.

Most traditional community discovery methods divide communities using only the structural information of the nodes or only their attribute information. Graph Neural Networks (GNN) can combine the structural information of the network with the node attribute information and learn from both at the same time. Traditional graph neural networks perform well on community discovery in homogeneous information networks, but they cannot exploit the different node types and edge types of heterogeneous information networks and therefore cannot make good use of their semantic information. Based on graph neural networks and the characteristics of heterogeneous information networks, this thesis proposes an overlapping community discovery method based on the heterogeneous graph attention network to solve the overlapping community discovery problem in heterogeneous information networks. The contributions are as follows:

1. To fully combine the structural information and attribute information of the heterogeneous information network, a heterogeneous network feature representation is constructed. Traditional graph neural networks are only suitable for homogeneous information networks: they combine the structural and attribute information of the nodes to learn feature representations, and use the resulting low-dimensional representation for downstream data analysis. To take advantage of the different node types in a heterogeneous information network, the structural information of the nodes is first represented as a node matrix for each specified meta-path, and then the structural information of the different meta-paths is combined with the attribute information to construct the node feature representation of the heterogeneous information network.

2. To fully mine heterogeneous network information, an improved heterogeneous graph attention network is used to extract node features. The heterogeneous graph attention network combines the graph neural network with the attention mechanism in the heterogeneous information network. It obtains the weights of meta-path-based neighbor nodes through a node-level attention mechanism, obtains the weights of the different meta-paths through a semantic-level attention mechanism, and fuses all weight information into a new node feature vector that fully exploits the different semantic information in the heterogeneous network. This thesis improves the activation function of the semantic-level attention mechanism in the heterogeneous graph attention network to solve the vanishing gradient problem, and learns the node feature vector jointly with the subsequently generated community membership matrix.

3. For overlapping community discovery, the heterogeneous graph attention network is combined with a graph convolutional neural network, and the loss is unified under the Bernoulli-Poisson (B-P) model. The node feature vector generated by the heterogeneous graph attention network is passed through the graph convolutional neural network to generate the community membership matrix, and the negative log-likelihood of the Bernoulli-Poisson model is used as the loss function to jointly optimize the node feature vector and the community overlap degree. In this way the B-P model can be used to discover overlapping communities in heterogeneous information networks, and the final community division is obtained by applying a community division threshold.

This thesis selects two real heterogeneous information network datasets, DBLP and IMDB, and conducts comparative experiments against the traditional community discovery algorithm SLPA and against algorithms based on graph neural networks: graph convolutional neural networks, graph attention networks, heterogeneous graph attention networks, and the NOCD algorithm. The improved extended modularity EQ* is used to measure the quality of overlapping community discovery in heterogeneous information networks. The experimental results show that the proposed model improves to some degree on both the traditional community discovery algorithm and the algorithms based on graph neural networks. In addition, an analysis of the meta-path weights obtained at the end of training shows that the weights learned by the improved heterogeneous graph attention network are consistent with a real-world understanding of the semantic information.
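As an illustration of the pipeline summarized in the abstract, the following is a minimal NumPy sketch, not the thesis's implementation. It shows three pieces: building a node matrix along one meta-path, evaluating the Bernoulli-Poisson negative log-likelihood for a given membership matrix, and thresholding that matrix into overlapping communities. All function names, the Author-Paper-Author meta-path, the toy data, the threshold value, and the balanced averaging of edge and non-edge terms are assumptions made for this example. The attention network and the graph convolutional network that would produce the membership matrix are omitted.

```python
import numpy as np


def metapath_adjacency(incidence):
    """Node matrix along a two-hop meta-path (assumed: Author-Paper-Author).

    `incidence` is a binary (n_authors x n_papers) matrix. Two authors are
    linked when they share at least one paper; self-loops are removed.
    """
    shared = incidence @ incidence.T          # count of shared papers
    adj = (shared > 0).astype(float)
    np.fill_diagonal(adj, 0.0)
    return adj


def bp_negative_log_likelihood(adj, membership, eps=1e-10):
    """Bernoulli-Poisson NLL with P(edge u-v) = 1 - exp(-F_u . F_v).

    `membership` (F) is a non-negative (n_nodes x n_communities) matrix.
    Edge and non-edge terms are averaged separately so that sparse graphs
    are not dominated by non-edges (an assumed balancing choice).
    """
    scores = membership @ membership.T
    upper = np.triu_indices(adj.shape[0], k=1)   # each unordered pair once
    s, a = scores[upper], adj[upper]
    edge_term = -np.log(1.0 - np.exp(-s[a == 1]) + eps)
    non_edge_term = s[a == 0]
    edge_loss = edge_term.sum() / max(edge_term.size, 1)
    non_edge_loss = non_edge_term.sum() / max(non_edge_term.size, 1)
    return 0.5 * (edge_loss + non_edge_loss)


def assign_communities(membership, threshold):
    """Boolean (n_nodes x n_communities) matrix; a row may have several True."""
    return membership > threshold


# Toy data (hypothetical): authors 0-2 wrote paper 0, authors 2-4 wrote paper 1.
incidence = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1]], dtype=float)
adj = metapath_adjacency(incidence)

# A membership matrix that matches the structure, and one that does not.
good = np.array([[2, 0], [2, 0], [2, 2], [0, 2], [0, 2]], dtype=float)
bad = np.array([[2, 0], [0, 2], [2, 2], [2, 0], [0, 2]], dtype=float)

print("NLL (good):", bp_negative_log_likelihood(adj, good))
print("NLL (bad): ", bp_negative_log_likelihood(adj, bad))
print(assign_communities(good, threshold=1.0))
```

In this toy case the membership matrix that places author 2 in both communities yields the lower loss, and thresholding it assigns that author to two communities at once, which is the overlapping behavior the B-P model is meant to capture.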

Keywords/Search Tags:

Heterogeneous graph attention network, Overlapping community discovery, Meta-path, Graph convolutional neural network, Bernoulli-Poisson model

Related items

1	Research On Heterogeneous Networks Overlapping Community Discovery Algorithm Based On Network Embedding
2	Fraud Detection On GitHub Using Heterogeneous Graph Neural Network
3	Research On Overlapping Community Discovery Method Based On Deep Clustering Fusion
4	Research On Overlapping Community Detection Algorithms Based On Cluster Ensemble And Graph Neural Network
5	Overlapping Community Detection Based On Deepwalk And Graph Neural Network
6	Research On Heterogeneous Graph Representation Learning Algorithm Based On Meta-path
7	Community Detection Based On Graph Neural Networks
8	Research On Learning Of Complex Features For Prediction Of LncRNA-Disease Associations
9	Research On Node Classification Based On Graph Convolutional Networks In Heterogeneous Graphs
10	Research On Heterogeneous Network Representation Learning Based On Graph Attention Mechanism