
The Research Of Microblog Rumor Identification Based On Deep Learning

Posted on: 2022-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: M J Gao
Full Text: PDF
GTID: 2517306518451484
Subject: Statistics
Abstract/Summary:
Sina Weibo is currently one of the most widely used social networking and information dissemination platforms in China. Its massive volume of information and rapid dissemination give it great influence, and these features make it an important medium for the emergence and outbreak of online public opinion. The governance of Internet rumors therefore cannot be delayed. Automated detection of Internet rumors has become a hot topic in natural language processing and text mining and has attracted extensive attention from scholars at home and abroad; research on this problem provides new ideas and methods for the detection, early warning, monitoring and governance of Internet rumors. This thesis therefore studies rumors on Sina Weibo and builds a deep learning model to identify microblog rumors automatically and reduce the harm caused by their spread.

The data set consists of rumor texts published by the Sina Weibo community management center and non-rumor microblog texts randomly crawled from Weibo Square. The text data are first pre-processed by data cleaning, Chinese word segmentation and stop-word removal. A BERT word vector model then transforms the pre-processed text into computer-recognizable n*768-dimensional vectors, and principal component analysis reduces the dimensionality, retaining the first 100 dimensions. The vectorized data are clustered with the MiniBatch K-Means algorithm. After clustering, LDA topic models are built for the different categories of microblog rumors to mine the topics of each category and extract its topic words, and category labels are defined for the rumors according to these topic words.

To improve the classification performance of the rumor recognition model, the microblog texts are divided into categories according to the rumor labels extracted above, and a separate LSTM recurrent neural network rumor recognition model is built for each category to learn the semantic information in the microblog texts and comments. For the text and the time-series-based comments, the thesis also introduces an attention mechanism, which unfolds the comments by time node and assigns each node a weight according to its degree of information influence. Because microblog comments vary in length and an overly long time series is difficult to handle, the comments are first divided into chunks, and each chunk is treated as the input at one time node. Introducing the attention mechanism improves both model performance and operating efficiency compared with the LSTM neural network model alone.

The study leads to the following conclusions:
1. According to the textual characteristics of microblog rumor data, microblog rumors can be classified into three categories: social rumors, entertainment rumors and health rumors.
2. The rumor recognition model built after classification performs significantly better than the model built before classification, improving accuracy by 2%-3%.
3. When the model is applied to epidemic rumors that have not been publicized by the community management center, it identifies these rumors, which demonstrates the effectiveness and practicality of the model and its value for monitoring and managing microblog rumors.
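
The comment-chunking and attention step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the `(timestamp, text)` comment format, the function names `chunk_comments` and `attention_pool`, the near-equal-count chunking rule, and dot-product scoring against a fixed `query` vector are all hypothetical. In the thesis the per-node vectors would presumably come from the LSTM and the weights would be learned during training.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical comment format: (timestamp, text).
Comment = Tuple[float, str]


def chunk_comments(comments: Sequence[Comment], num_nodes: int) -> List[List[str]]:
    """Sort comments by time and split them into num_nodes ordered chunks."""
    ordered = [text for _, text in sorted(comments, key=lambda c: c[0])]
    chunks: List[List[str]] = [[] for _ in range(num_nodes)]
    for i, text in enumerate(ordered):
        # Earlier comments map to earlier time nodes; chunk sizes differ by at most one.
        chunks[i * num_nodes // len(ordered)].append(text)
    return chunks


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a non-empty list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    states: Sequence[Sequence[float]], query: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Dot-product attention over per-node state vectors (states must be non-empty).

    Returns the attention weights (one per time node) and the weighted
    sum of the state vectors (the context vector).
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in states]
    weights = softmax(scores)
    dim = len(states[0])
    context = [
        sum(w * state[d] for w, state in zip(weights, states)) for d in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    comments = [(3.0, "really?"), (1.0, "is this true"), (2.0, "source?"), (4.0, "debunked")]
    nodes = chunk_comments(comments, num_nodes=2)
    # Placeholder 2-d vectors standing in for per-node LSTM hidden states (assumption).
    states = [[0.2, 0.1], [0.9, 0.4]]
    weights, context = attention_pool(states, query=[1.0, 0.5])
    print(nodes, weights, context)
```

Chunking fixes the number of time nodes regardless of how many comments a post receives, so every post yields a sequence of the same length; the attention weights then let nodes with more informative comments contribute more to the pooled representation.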
Keywords/Search Tags: rumor detection, Sina Weibo, text clustering, deep learning