| With the rapid development of the information age, communication has gradually shifted from offline to online, giving rise to a wide range of online social media applications and changing traditional ways of communicating. As one of the most popular Chinese social media platforms, Weibo is favored by Internet users for its simplicity, speed, and real-time nature. Hundreds of millions of microblog comments are posted on the platform every day, and this huge volume of data contains users’ opinions and rich emotional information that is valuable for personal decision making, business strategy adjustment, and government guidance of public opinion. Mining and analyzing the sentiment tendencies in microblog data has therefore attracted widespread attention from academia and industry. Supervised microblog sentiment classification currently achieves good results, but it relies on large amounts of labeled data. In practice, acquiring that much labeled data costs considerable labor and time, whereas unlabeled data is much easier to obtain. This paper therefore adopts semi-supervised learning, which combines a small amount of labeled data with a large amount of unlabeled data, for microblog sentiment tendency analysis. Co-training has been an active research topic in semi-supervised learning in recent years, and Tri-training is one of its most important and widely used methods, so this paper uses the semi-supervised Tri-training algorithm to analyze the sentiment tendency of microblogs. We analyze the sentiment of Sina Weibo text data and propose two models, a Tri-training model based on the fuzzy support vector machine (FSVM) and a word two-channel model with improved Tri-training, to address problems such as scarce labeled data, high labeling cost, text noise, and label noise introduced during prediction, and we validate the effectiveness of the proposed models on the COVID-19 epidemic 
microblog sentiment dataset. The main research results are as follows. (1) A Tri-training model based on FSVM is proposed. To reduce the impact of text noise and label noise on classification performance, we approach the problem from the classifier side and propose a Tri-training model based on the fuzzy support vector machine. The base classifier builds on the support vector machine with fuzzy C-means clustering and introduces a membership function to fuzzify all samples, and the model is then trained within the Tri-training framework. The experimental results show that the FSVM-based Tri-training model effectively reduces the influence of noisy data on classification performance and improves classification accuracy. (2) A word two-channel model with improved Tri-training is proposed. To address the label noise that existing models introduce when assigning labels to unlabeled samples, the word two-channel model improves Tri-training from the input side: Word2Vec and BERT models extract sample features in different spaces, so the model can learn the differences between different features of the same sample, which helps it judge the sentiment tendency of microblog texts. Simulation experiments on the same dataset show that it further improves classification performance over the FSVM-based Tri-training model. (3) Sentiment classification of microblogs about the COVID-19 epidemic. We analyze the sentiment tendency of microblog texts posted during the epidemic. Using a public dataset of epidemic-related microblogs, the two proposed models are applied to the pre-processed and lexically labeled microblog comment texts. The experimental results demonstrate that both proposed models outperform several comparison models, and that the word two-channel model with improved Tri-training 
achieves the best classification results. The proposed models can help reveal public sentiment tendencies and support the government in further improving its public opinion guidance work. |
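
To illustrate the general Tri-training idea the abstract builds on, the following is a minimal sketch under stated assumptions, not the models proposed in the paper. It swaps in a tiny nearest-centroid classifier for the FSVM base learner, uses 2-D toy vectors in place of Word2Vec or BERT text features, runs a fixed number of rounds, and omits the error-rate checks of the full algorithm. The names `CentroidClassifier`, `tri_training`, and `predict` are hypothetical.

```python
import random
from collections import Counter


class CentroidClassifier:
    """Nearest-centroid classifier (assumed stand-in for an SVM/FSVM base learner)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for xi, yi in zip(X, y):
            acc = sums.setdefault(yi, [0.0] * len(xi))
            for j, v in enumerate(xi):
                acc[j] += v
        # One centroid per class: the mean of that class's feature vectors.
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict_one(self, x):
        def sqdist(c):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[c]))
        return min(self.centroids, key=sqdist)


def tri_training(X_lab, y_lab, X_unlab, rounds=3, seed=0):
    """Simplified Tri-training: fixed rounds, no error-rate conditions."""
    rng = random.Random(seed)
    n = len(X_lab)
    # Step 1: train three classifiers on bootstrap samples of the labeled set.
    models = []
    for _ in range(3):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(CentroidClassifier().fit([X_lab[i] for i in idx],
                                               [y_lab[i] for i in idx]))
    # Step 2: each classifier is retrained on the labeled set plus the
    # unlabeled samples on which the other two classifiers agree.
    for _ in range(rounds):
        updated = []
        for i in range(3):
            j, k = [m for m in range(3) if m != i]
            X_aug, y_aug = list(X_lab), list(y_lab)
            for x in X_unlab:
                pj, pk = models[j].predict_one(x), models[k].predict_one(x)
                if pj == pk:
                    X_aug.append(x)
                    y_aug.append(pj)  # pseudo-label agreed by the two peers
            updated.append(CentroidClassifier().fit(X_aug, y_aug))
        models = updated
    return models


def predict(models, x):
    """Final prediction by majority vote over the three classifiers."""
    votes = Counter(m.predict_one(x) for m in models)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters standing in for sentiment features.
X_lab = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
y_lab = ["neg", "neg", "pos", "pos"]
X_unlab = [(0.2, 0.5), (5.1, 5.4), (0.1, 0.1), (4.9, 5.9)]

models = tri_training(X_lab, y_lab, X_unlab)
print(predict(models, (0.3, 0.3)), predict(models, (5.2, 5.2)))
```

Requiring agreement between the two peer classifiers before a pseudo-label is accepted is what limits the label noise that unlabeled samples introduce. The abstract's two models target that same noise from the classifier side and from the input side, respectively.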