
The Study Of Semi-supervised Classification Of Imbalanced Graph

Posted on: 2021-05-14
Degree: Master
Type: Thesis
Country: China
Candidate: M Zhu
GTID: 2370330620468141
Subject: Computer Science and Technology
Abstract/Summary:
A graph is a data structure that describes relationships between things and is often used to represent complex relational data. Node classification on graphs is applied in many fields, such as social, political, and biological engineering. In practice, to make full use of unlabeled data and to preserve the integrity of the graph structure, node classification usually takes the form of semi-supervised classification. However, because of sampling bias and other objective factors, the distribution of labeled data is usually imbalanced across classes.

Existing semi-supervised classification methods for graphs adapt poorly to imbalanced data sets. Most graph neural network methods do not consider imbalance at all: when aggregating feature information, they do not distinguish the importance of labeled data from the majority class and from the minority class, so samples belonging to the minority class are easily misclassified as the majority class. In addition, in some deep graph neural networks the features of nodes belonging to different classes tend to become similar during information aggregation, which leads to poor classification performance on imbalanced data sets.

To address the failure of graph neural network methods to distinguish the importance of the majority class and the minority class during feature aggregation, this paper proposes ASAGNN, an aggregation-scale adaptive graph neural network method. The method assigns different aggregation scales to different types of nodes, based on the position of each node relative to the labeled nodes in the graph structure, so that the feature information of labeled minority-class samples has more opportunities to be used by distant neighbor nodes. This may reduce the likelihood that samples belonging to the minority class are misclassified into the majority class.

To address the indistinguishable features produced by some deep graph neural network methods, this paper proposes the ClusteringGCN method. The method clusters similar nodes of the graph into subgraphs based on the original node features, and applies a graph convolution layer of the GCN model to the subgraphs and to the original graph respectively. ClusteringGCN strengthens the role of the original node features in classification, keeps the aggregated node features distinguishable, and slows down over-smoothing.

Additionally, this paper attempts to apply commonly used techniques for imbalanced data sets to the semi-supervised classification of imbalanced graphs. We propose a graph oversampling method, which optimizes the structure of the graph by generating new sample nodes for minority classes. We also use cost-sensitive techniques to improve the semi-supervised classification model of the graph.

To verify the effectiveness of these methods, this paper conducts experiments on the Cora, Citeseer, and Pubmed benchmark datasets. The experimental results show that the ASAGNN and ClusteringGCN methods outperform common methods, and that graph oversampling and cost-sensitive techniques can effectively improve the performance of the semi-supervised classification model on imbalanced graphs. In particular, the ASAGNN method combined with cost-sensitive techniques outperforms the other methods on most imbalanced data sets.
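The abstract does not specify which cost-sensitive technique the thesis uses. As a general illustration only, one common cost-sensitive approach is to weight the loss on each labeled node by the inverse frequency of its class, so that errors on minority-class nodes cost more. The sketch below assumes that approach; the function names `inverse_frequency_weights` and `weighted_cross_entropy` and the toy labels are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Assumed weighting scheme: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}


def weighted_cross_entropy(probs, labels, weights):
    """Class-weighted mean of -log p(true class) over the labeled nodes.

    probs[i][c] is the predicted probability of class c for labeled node i.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Toy labeled set: four majority-class nodes (0) and one minority-class node (1).
labels = [0, 0, 0, 0, 1]
weights = inverse_frequency_weights(labels)

# The same confident mistake, made once on a majority node and once on the minority node.
good, bad = [0.9, 0.1], [0.1, 0.9]
miss_majority = [bad, [0.9, 0.1], [0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]
miss_minority = [[0.9, 0.1]] * 4 + [[0.9, 0.1]]

loss_major = weighted_cross_entropy(miss_majority, labels, weights)
loss_minor = weighted_cross_entropy(miss_minority, labels, weights)
```

With these weights, misclassifying the single minority node yields a larger loss than misclassifying one of the four majority nodes, which is the effect a cost-sensitive objective is meant to have on an imbalanced labeled set.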
Keywords/Search Tags: Semi-supervised Classification, Imbalanced Classification, Graph Neural Network, Graph Representation Learning, Deep Learning