| Sentiment analysis, one of the core tasks of natural language processing, aims to identify the emotional tendency of text and is widely used in both academia and industry. It can be divided into coarse-grained and fine-grained sentiment classification. Coarse-grained sentiment classification identifies the sentiment polarity of a comment as a whole; fine-grained sentiment classification extracts aspect terms and identifies the sentiment polarity of each. Both can extract valuable information from Internet comments, which is of great significance to consumers, businesses, and governments. Text sentiment analysis requires a large amount of annotated data: a model trained on a small dataset tends to overfit, yet annotating data demands substantial manpower. Data augmentation is one way to address this problem. This paper studies data augmentation for sentiment analysis tasks and covers the following three aspects. 1. Coarse-grained sentiment classification suffers from sample imbalance, with positive comments often outnumbering negative ones, as well as from spurious associations. To address both problems, we propose antonymous sample generation: given an original sample, antonym substitution is used to obtain a sample with the opposite label, and reinforcement learning is used to further improve the quality of the antonymous samples. Experimental results show that the reinforcement-learning-based framework of dual antonymous sample generation and dual sentiment classification effectively improves both the accuracy of coarse-grained text classification and the robustness of the classification model. 2. For fine-grained sentiment classification, annotation requires more human and material resources than for coarse-grained text classification. To address this problem, we propose a data augmentation method based on a language model and word substitution. The proposed method effectively improves the robustness of fine-grained sentiment classifiers. 3. For fine-grained sentiment classification under domain adaptation, we focus on unsupervised domain adaptation. Building on the second work, and to resolve the semantic irregularities caused by word replacement, we further use the language model to generate target-domain samples together with their annotations. Experimental results show that this method effectively improves the performance of the fine-grained sentiment classification model. |
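
The antonym substitution step described in the first contribution can be illustrated with a minimal sketch. It is not the implementation used in this work: the toy lexicon `ANTONYMS`, the function names `antonymous_sample` and `augment`, and the 0/1 label encoding are all assumptions made for illustration, and the reinforcement-learning refinement of sample quality is omitted entirely.

```python
# Hypothetical toy antonym lexicon; a real system would draw on a lexical resource.
ANTONYMS = {
    "good": "bad",
    "bad": "good",
    "great": "terrible",
    "terrible": "great",
    "love": "hate",
    "hate": "love",
}


def antonymous_sample(text, label, antonyms=ANTONYMS):
    """Return (new_text, new_label), or None if no sentiment word was replaced.

    Labels are assumed to be 1 (positive) or 0 (negative).
    """
    replaced = False
    out = []
    for tok in text.split():
        key = tok.lower()
        if key in antonyms:
            out.append(antonyms[key])  # swap the sentiment word for its antonym
            replaced = True
        else:
            out.append(tok)
    if not replaced:
        return None  # nothing was flipped, so the label cannot safely be inverted
    return " ".join(out), 1 - label


def augment(dataset, antonyms=ANTONYMS):
    """Append an antonymous copy of every sample that can be flipped."""
    augmented = list(dataset)
    for text, label in dataset:
        flipped = antonymous_sample(text, label, antonyms)
        if flipped is not None:
            augmented.append(flipped)
    return augmented
```

Applied to a set of mostly positive comments, `augment` adds a negative counterpart for each sample that contains a word from the lexicon, which is how such a scheme could reduce label imbalance. The sketch splits on whitespace only, so words with attached punctuation are not matched, and it ignores negation ("not good"), which is one reason a quality-control step would be needed in practice.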