
Multi-Stream Heterogeneous Graph Convolutional Network And Its Application In Text Classification

Posted on: 2024-06-20; Degree: Master; Type: Thesis
Country: China; Candidate: S Y Chen; Full Text: PDF
GTID: 2557307049452084; Subject: Statistics
Abstract/Summary:
With the rapid development of Internet technology, more and more people obtain and publish information online. As the number of Internet users worldwide grows, the volume of data produced on the Internet also increases dramatically every day. On the one hand, the resulting information overload seriously affects people's daily lives. On the other hand, large numbers of unconfirmed rumors circulate on social networking platforms and cause real difficulties in people's work and life. Text is one of the most common kinds of network data, so classifying it efficiently and assessing its reliability has long been an active research direction. Text classification models are widely used to process text data in tasks such as document classification and rumor detection. Recently, convolutional networks have been widely applied to text classification. However, the currently popular convolution-based text classification models are limited in how they use corpus information; for example, they ignore the statistical information in the corpus. Starting from the question of how to use corpus information effectively, this thesis presents two multi-stream heterogeneous graph convolutional network models for text classification, applied to document classification and rumor detection, respectively. The main work of this thesis is as follows:

(1) A Multi-stream Heterogeneous Graph Convolutional Network based on Representative-Word Documents (MHGCN-RWD) is presented. Using the word distribution information in the corpus, the model identifies representative words that characterize a certain kind of document and adds them, in the form of a document, to a large document heterogeneous graph. The resulting graph contains both the global semantic information and the word distribution information of the documents in the corpus. This thesis filters out several sets of representative words from different document ranges and constructs several sets of Representative-Word Documents. Together with the original documents in the corpus, these are used to construct several Representative-Word Document heterogeneous graphs for training the MHGCN-RWD model. Finally, the MHGCN-RWD model fuses the document features from the multiple heterogeneous graphs for document classification. The MHGCN-RWD model is applied to five text classification datasets, and the experimental results show that it achieves the best classification accuracy on most of them.

(2) A Multi-stream Heterogeneous Graph Convolutional Network based on Tweet-Reaction Documents (MHGCN-TRD) is proposed. It mines the structural information between tweets and their responses according to the structural order of the reactions, mines the topic information of tweets according to the similarity between tweets in the corpus, and adds both kinds of information to the large document heterogeneous graphs used to train the model. This thesis first constructs groups of Tweet-Reaction Documents at different levels according to the structure of the reactions, then extracts topics from the tweets and constructs Topical-Word Documents. Several Tweet-Reaction Document heterogeneous graphs are then constructed from the multiple sets of Tweet-Reaction Documents, the Topical-Word Documents, and the tweet documents, respectively, to train the MHGCN-TRD model. Finally, the MHGCN-TRD model fuses the tweet features from the multiple heterogeneous graphs to detect whether a tweet is a rumor. The MHGCN-TRD model is applied to the Pheme dataset, which carries topic information. The experimental results show that its detection accuracy on the Pheme dataset exceeds that of the most popular current rumor detection models.
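The abstract does not say how representative words are scored, so the following is only a minimal sketch of the idea behind Representative-Word Documents, not the thesis's method. It assumes a simple criterion: a word's count within a class divided by its count in the whole corpus, with the in-class count as a tie-breaker. The function name `representative_word_documents`, the parameter `top_k`, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict


def representative_word_documents(docs, labels, top_k=2):
    """Build one pseudo-document of representative words per class.

    Assumed scoring (not taken from the thesis): in-class count divided
    by corpus count, then in-class count, then alphabetical order.
    """
    corpus_counts = Counter()
    class_counts = defaultdict(Counter)
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        corpus_counts.update(tokens)
        class_counts[label].update(tokens)

    rw_docs = {}
    for label, counts in class_counts.items():
        # Rank words that are concentrated in this class first.
        ranked = sorted(
            counts,
            key=lambda w: (-counts[w] / corpus_counts[w], -counts[w], w),
        )
        rw_docs[label] = ranked[:top_k]
    return rw_docs


# Toy corpus, invented purely for illustration.
docs = [
    "stocks fall as markets react",
    "markets rally and the stocks rise",
    "team wins the final match",
    "the team loses the match",
]
labels = ["finance", "finance", "sport", "sport"]
rw_docs = representative_word_documents(docs, labels)
print(rw_docs)
```

In a setup like the one the abstract describes, each word list would be added to the heterogeneous graph as an extra document node, and repeating the selection over different document ranges would yield the several sets of Representative-Word Documents, one per stream.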
Keywords/Search Tags:Multi-stream, Graph Convolutional Network, Feature fusion, Text classification, Rumor detection