
Validity And Fine-grained Sentiment Analysis Of Online Reviews Based On Deep Learning

Posted on: 2024-08-22
Degree: Master
Type: Thesis
Country: China
Candidate: S B Wang
GTID: 2568307130955819
Subject: Applied Statistics
Abstract/Summary:
With the growth and spread of the Internet, more and more people express their opinions and attitudes online in text form, and the large amount of information this text contains has considerable analytical and commercial value. However, analyzing only the overall sentiment of a sample makes it hard to recover the attitudes toward the different aspects of the thing being reviewed, and it tends to hide the specific strengths and weaknesses of a product. More seriously, when the sentiment polarity of one aspect is opposite to the overall polarity, overall sentiment analysis cannot reflect that aspect's sentiment, and much useful information is lost. In addition, online review text by its nature contains a large amount of unstructured text, false information, and irrelevant information. These bad samples seriously interfere with fine-grained sentiment analysis and may even lead to completely opposite conclusions, so the data must be screened for validity before fine-grained sentiment analysis is carried out.

On this basis, the main tasks of this thesis are:

1) A validity filtering rule for online review texts is proposed. A set of screening feature words is obtained from the nouns in the data itself and from consulted reference information, and is updated automatically after construction. The set is then used to screen the text for validity, so that the dataset reflects the real situation as closely as possible.

2) Two data augmentation methods designed specifically for fine-grained sentiment analysis, based on permutation and on irrelevant-data transformation, are proposed to alleviate the sample imbalance problem. A semi-supervised model with a hybrid loss function is also proposed: the data is divided into unlabeled data, which uses cross-entropy loss, and labeled data, which uses focal loss, and the model is trained on a loss computed as a mixture of the two. This lets the model learn from more data, which addresses the sample imbalance problem and improves classification performance.

3) A real case is analyzed using the validity screening rules and the semi-supervised BERT (semi-BERT) described above. The analysis runs from data acquisition, through validity filtering and training of the fine-grained sentiment analysis model, to presentation of the results, and the model is finally used to monitor the fine-grained sentiment of future comments.
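The hybrid loss in task 2) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything below is an assumption: unlabeled samples are taken to carry pseudo-labels (so that cross-entropy can be computed on them), the two terms are combined as a weighted sum, and the names `hybrid_loss`, `weight`, and `gamma` are hypothetical rather than taken from the thesis.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for one sample with class index `target`."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0):
    """Focal loss: cross-entropy scaled by (1 - p)^gamma, which
    down-weights samples the model already classifies confidently."""
    p = softmax(logits)[target]
    return -((1.0 - p) ** gamma) * math.log(p)


def hybrid_loss(labeled, pseudo_labeled, weight=0.5, gamma=2.0):
    """Assumed mixture: mean focal loss on labeled samples plus
    `weight` times mean cross-entropy on pseudo-labeled samples.

    Both arguments are non-empty lists of (logits, class_index) pairs.
    """
    supervised = sum(focal_loss(l, y, gamma) for l, y in labeled) / len(labeled)
    unsupervised = sum(cross_entropy(l, y) for l, y in pseudo_labeled) / len(pseudo_labeled)
    return supervised + weight * unsupervised


# Usage: one labeled and one pseudo-labeled three-class sample.
loss = hybrid_loss(
    labeled=[([2.0, 0.5, -1.0], 0)],
    pseudo_labeled=[([0.1, 1.2, 0.3], 1)],
)
```

With `gamma=0` the focal term reduces to plain cross-entropy, which is one way to check an implementation. In a real model the same computation would run on BERT output logits inside a deep learning framework; the pure-Python form here is only meant to show the arithmetic.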
Keywords/Search Tags: Comment validity, Semi-supervised learning, Data augmentation, BERT