
Research On Bad Microblog Text Classification Based On Deep Learning

Posted on: 2022-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: M X Wu
GTID: 2568306836977959
Subject: Applied Statistics
Abstract/Summary:
In recent years, deep learning has been widely applied in natural language processing thanks to its strong feature extraction and classification capabilities. However, even under existing supervision systems, negative information and comments still pollute the online environment. This thesis therefore studies bad text on the web using deep learning. First, to address the loss of text information caused by insufficient feature extraction, a feature-fusion method for bad text classification is proposed. It combines word features extracted by N-gram with semantic features extracted by a Word2vec-based BiLSTM, which captures text information more fully and reduces information loss. Experimental results show that the feature-fusion method performs better on bad text classification. Second, word segmentation errors greatly affect the semantic learning ability of recurrent neural networks, and the features extracted by convolutional neural network models are not sufficient to distinguish the degree of badness. To address both problems, a BERT-BiLSTM model is proposed to analyze the degree of badness of bad web text. BERT extracts character-level feature vectors from the text, a BiLSTM extracts semantic features, and Softmax classifies the text. Experimental results show that, compared with traditional machine learning methods and with deep learning methods that extract word vectors, this method performs better in analyzing the degree of badness of bad web text. Finally, the experimental results of the two models on bad text classification and on bad-degree analysis are compared. In bad text classification, the number of texts to classify is large, and the feature-fusion BiLSTM model achieves better classification results in a shorter training time. In the bad-degree classification task, the data volume is small, and the BERT-BiLSTM model has clear advantages in accuracy and time consumption.
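The feature-fusion idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes that fusion means simple concatenation, that the input is already tokenized, and that the semantic vector is a hand-written placeholder standing in for a BiLSTM output. All function names and values here are hypothetical.

```python
from collections import Counter
import math


def ngrams(tokens, n=2):
    """Return the overlapping n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_vector(tokens, vocab, n=2):
    """Count how often each vocabulary n-gram occurs in the token list."""
    counts = Counter(ngrams(tokens, n))
    return [float(counts[g]) for g in vocab]


def fuse(ngram_feats, semantic_feats):
    """Fuse the two feature views by concatenation (an assumption)."""
    return list(ngram_feats) + list(semantic_feats)


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical bigram vocabulary and a toy tokenized sentence.
vocab = [("you", "are"), ("are", "bad"), ("very", "good")]
tokens = ["you", "are", "bad"]

# Placeholder for the semantic vector a BiLSTM would produce.
semantic = [0.1, -0.3, 0.7]

fused = fuse(ngram_vector(tokens, vocab), semantic)

# Toy class scores; a trained classifier layer would produce these.
probs = softmax([1.0, 2.0])
```

In a real pipeline the fused vector would feed a trained classification layer, and the semantic part would come from a BiLSTM over Word2vec or BERT embeddings rather than fixed numbers.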
Keywords/Search Tags: Bad text, deep learning, feature fusion, BiLSTM, BERT model