Text classification is a supervised learning task that assigns labels to text. With the development of the information age, an increasing share of text appears in the form of short texts, yet traditional deep learning methods do not effectively alleviate the semantic sparsity of short texts. This paper therefore constructs two short text classification models based on the graph attention network: the Multi-Type Graph Attention Network (MTGAT) and the MTGAT-BERT model. Through a two-level graph attention mechanism on a heterogeneous graph, the models capture the global information of short texts, effectively alleviating the semantic sparsity caused by their limited length.

The MTGAT model constructs a heterogeneous graph from the text dataset and establishes relationships between originally independent texts through edges between adjacent nodes. To mine the implicit semantic information in short texts, the model uses a topic model to extract topic distribution information and adds topic-type nodes to the heterogeneous graph, so that short texts sharing the same topic become connected by edges. The model learns the features of adjacent nodes through spatial graph convolution and introduces a two-level attention mechanism that assigns different attention weights to individual adjacent nodes and, at a second level, to node types. MTGAT thus outputs node embeddings that contain global information.

To make full use of the global information in the MTGAT graph while capturing the local context information of short texts with the BERT pre-trained model, this paper further constructs the MTGAT-BERT model, which integrates the graph embeddings produced by MTGAT with BERT's word embeddings in the embedding layer of BERT. The local context information of a short text and its corresponding graph embedding are then fused through a self-attention mechanism; finally, features are extracted by the different filters of a TextCNN model, and the predicted class of the short text is output to complete the classification task.

To verify the classification effect of the graph embeddings and of the MTGAT-BERT short text classification model, this paper conducts comparative experiments against baseline models on different datasets. The experimental results show that the accuracy of the MTGAT-BERT model is higher than that of the baseline models, which demonstrates the effectiveness of the model for the short text classification task.
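The two-level attention described above can be illustrated with a minimal NumPy sketch: node-level attention weights the neighbors of one type, and type-level attention then weights the per-type aggregates. All dimensions, weight vectors, and neighborhood sizes here are assumed for illustration only, not the paper's actual learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding dimension (assumed)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def node_level_attention(target, neighbors, a):
    # First level: score each neighbor of one type against the target
    # node, normalize the scores, and take the weighted sum.
    scores = np.array([a @ np.concatenate([target, n]) for n in neighbors])
    alpha = softmax(scores)          # attention weights over neighbors
    return alpha @ neighbors         # aggregated embedding, shape (d,)

def type_level_attention(per_type, q):
    # Second level: score each type's aggregate and combine across types,
    # so different node types (e.g. word vs. topic) get different weights.
    scores = np.array([q @ np.tanh(h) for h in per_type])
    beta = softmax(scores)           # attention weights over node types
    return beta @ np.stack(per_type) # final embedding, shape (d,)

# Hypothetical neighborhood of one short-text node in the heterogeneous graph:
target = rng.normal(size=d)
word_neighbors = rng.normal(size=(3, d))   # neighbors of type "word"
topic_neighbors = rng.normal(size=(2, d))  # neighbors of type "topic"
a = rng.normal(size=2 * d)                 # node-level attention vector (assumed)
q = rng.normal(size=d)                     # type-level attention vector (assumed)

h_word = node_level_attention(target, word_neighbors, a)
h_topic = node_level_attention(target, topic_neighbors, a)
h = type_level_attention([h_word, h_topic], q)
print(h.shape)  # (8,)
```

In a trained model, `a` and `q` would be learned parameters and the aggregation would run over the whole graph, but the two softmax stages above are the core of assigning weights per neighbor and per node type.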