
Research On Text Sentiment Classification Algorithm Based On Multi-feature Weighting And Hybrid Network

Posted on: 2022-09-16
Degree: Master
Type: Thesis
Country: China
Candidate: J J Li
Full Text: PDF
GTID: 2518306740951719
Subject: Electronics and Communications Engineering
Abstract/Summary:
Users generate an extremely large volume of text every day, and mining people's emotional tendencies from web text has great social and economic value. Text sentiment classification based on deep learning has shown great potential. However, these methods currently face the following problems:
1) The traditional word2vec distributed word vector representation reflects only the context and semantic information of words and does not consider how important each word is to the text. The traditional term frequency-inverse document frequency (TFIDF) method of weighting word vectors does not consider differences in the proportion of a word across categories.
2) Different emotional words affect the emotional tendency of a text to different degrees, and word vectors trained by word2vec do not reflect this difference.
3) Words of different parts of speech influence the emotional tendency of a text differently, and word vectors trained by word2vec do not reflect this influence of part of speech.
4) Used alone, neither the convolutional neural network nor the long short-term memory network is sufficient to extract the global feature information that characterizes the text.
5) The cross-entropy loss function used in traditional text classification does not classify difficult samples in the data set accurately enough.
To address these problems, two text sentiment classification models based on deep learning are designed. The main work of this thesis is as follows:
1) A word vector strategy combining category ratio and sentiment weighting is designed, and the weighted word2vec word vectors are fed to a dual-channel BiLSTM network for feature extraction and sentiment classification. The CT value starts from the TFIDF value of a word and introduces the between-class and out-of-class scale factor CR (Category Ratio) as an improvement, so that it considers not only the proportion of a word in the text but also the difference in its proportion across categories. Because different words affect the emotional tendency of a text differently, this thesis uses a sentiment dictionary to assign different emotional weights to words of different natures and weights the corresponding word vectors with them, so that the vectors reflect differences in the emotional tendency the words express. Experiments on the COAE data set show that the category-ratio-weighted word2vec word vector improves accuracy by 2.2% over the original word2vec word vector, and the emotion-weighted word2vec word vector improves it by 5%. The sentiment classification network built by combining the two improves accuracy by 7.04% over the benchmark model.
2) An emotional part-of-speech weighted hybrid network sentiment classification model, PEWBiLSTM-CNN-Att-FL, is constructed. The model takes emotional part-of-speech weighted word2vec word vectors as input, which reflect the different effects of words of different parts of speech on the sentiment tendency of the text. It makes full use of the BiLSTM network's ability to extract contextual information and the convolutional neural network's ability to extract local key information, and it uses an attention mechanism to assign different attention weights to words with different semantic effects. Combining the characteristics of the three yields the BiLSTM-CNN-Att-FL sentiment classification model. To improve the classification of difficult samples and the overall accuracy of sentiment classification, the focal loss function is introduced to strengthen learning on difficult samples. Experiments on the NLPCC data set show that the proposed PEWBiLSTM-CNN-Att-FL model improves accuracy by 2.76% over the benchmark model, and by 3.43% in the multi-class experiment.
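The abstract names the focal loss but does not define it. As a minimal sketch, assuming the commonly used form FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), the following Python compares it with cross-entropy on one easy and one hard sample. The values gamma = 2 and alpha = 1 and the example probabilities are illustrative assumptions; the thesis's actual settings are not given in the abstract.

```python
import math


def cross_entropy(probs, target):
    """Standard cross-entropy for one sample: -log(p_t)."""
    return -math.log(probs[target])


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t).

    probs  : predicted class probabilities (summing to 1)
    target : index of the true class
    gamma, alpha : assumed hyperparameters, not taken from the thesis
    """
    p_t = probs[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical predictions for two samples whose true class is 0.
easy = [0.9, 0.1]  # confident, correct prediction
hard = [0.3, 0.7]  # true class receives low probability

for name, probs in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(probs, 0)
    fl = focal_loss(probs, 0)
    print(f"{name}: CE={ce:.4f}  FL={fl:.4f}  FL/CE={fl / ce:.2f}")
```

With gamma = 2 the modulating factor is (1 - p_t)^2, so the easy sample's loss is scaled by 0.01 and the hard sample's by 0.49. The hard sample therefore contributes relatively more to the total loss than it would under plain cross-entropy, which is the sense in which the loss strengthens learning on difficult samples.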
Keywords/Search Tags: Word2Vec, Attention Mechanism, Sentiment Classification, BiLSTM, Deep Learning