The Research On Measuring The Sentiment Information In Chinese Opinion Summarization

Posted on: 2015-01-06 | Degree: Master | Type: Thesis
Country: China | Candidate: M Pan | Full Text: PDF
GTID: 2308330461474939 | Subject: Computer application technology
Abstract/Summary:
Opinion summarization is an important and difficult part of opinion mining. It aims to analyze and summarize sentiment information that carries a clear polarity. The task has attracted considerable attention from researchers in China and abroad, and some progress has been made. However, existing work overlooks the effect of sentiment information elements on sentiment information in two respects. First, it does not consider the relationship among reviewers, targets, and opinion expressions, which may reduce the accuracy of opinion summarization. Second, it ignores the effect of sentiment information elements on sentence similarity, which may reduce the diversity of opinion summaries. This thesis focuses on these two issues. The main contributions are as follows:

(1) We propose a method for measuring the emotional strength of sentiment information based on polarity strength. The algorithm applies the principle of PMI (Pointwise Mutual Information) and takes into account the relationship among reviewers, targets, and opinion expressions in order to measure the mainstream sentiment information. The summary is then generated under the hypothesis that a good summary should contain the mainstream sentiment information. Experimental results show that, compared with a method that does not take polarity strength into account, the polarity-strength-based method improves ROUGE-2 by 2.21%, ROUGE-SU4 by 2.01%, and ROUGE-SU9 by 2.45%.

(2) We propose a method, based on the elements of sentiment information, for addressing the problem of diversity in opinion summarization. The algorithm first uses spectral clustering to group the data and then builds a two-layer model that links sentences to sentences and sentences to classes. The model assumes that the sentiment information of a sentence is related to the sentences linked to it and to its corresponding class. Under this hypothesis, the principle of PageRank is used to compute the mainstream sentiment information. The similarity and difference between sentences are computed by introducing the sentiment elements and applying Euclidean distance, which yields diverse opinion summaries of products. Experimental results show that, compared with a method that does not take sentiment elements into account, the sentiment-element-based method improves ROUGE-2 by 3.49%, ROUGE-SU4 by 2.97%, and ROUGE-SU9 by 2.68%.

(3) To address redundancy in opinion summarization, we propose an algorithm based on Maximal Marginal Relevance. The algorithm uses both the emotional strength and the diversity of the sentiment information to score the sentiment information of each sentence. It then applies Maximal Marginal Relevance to select sentences so that the redundancy between each candidate sentence and the already selected sentences is minimized. Experimental results show that using Maximal Marginal Relevance to remove redundant information improves ROUGE-2 by 1.32%, ROUGE-SU4 by 1.34%, and ROUGE-SU9 by 1.38%.

In summary, to handle emotional intensity, diversity, and redundancy, this thesis proposes a method based on polarity intensity, a method based on sentiment elements, and a Maximal Marginal Relevance algorithm, respectively. Together, these methods effectively improve the precision of Chinese opinion summarization.
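
The PMI principle named in contribution (1) can be illustrated with a minimal sketch. The snippet below scores an opinion word by its PMI with positive seed words minus its PMI with negative seed words, a common formulation often called SO-PMI. The toy reviews, the seed words, the add-one smoothing, and all function names are assumptions made for illustration. The abstract does not say how the thesis combines reviewers, targets, and opinion expressions, so that part is not modeled here.

```python
import math
from collections import Counter
from itertools import combinations

def count_cooccurrences(reviews):
    """Count in how many reviews each word and each unordered word pair appears."""
    word_df, pair_df = Counter(), Counter()
    for tokens in reviews:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))  # pairs are emitted in sorted order
    return word_df, pair_df, len(reviews)

def pmi(w1, w2, word_df, pair_df, n, smooth=1.0):
    """PMI(w1, w2) = log2(p(w1, w2) / (p(w1) * p(w2))), with additive smoothing."""
    pair = (w1, w2) if w1 <= w2 else (w2, w1)
    p_joint = (pair_df[pair] + smooth) / (n + smooth)
    p1 = (word_df[w1] + smooth) / (n + smooth)
    p2 = (word_df[w2] + smooth) / (n + smooth)
    return math.log2(p_joint / (p1 * p2))

def polarity_strength(word, pos_seeds, neg_seeds, stats):
    """Sum of PMI with positive seeds minus sum of PMI with negative seeds."""
    word_df, pair_df, n = stats
    pos = sum(pmi(word, s, word_df, pair_df, n) for s in pos_seeds if s != word)
    neg = sum(pmi(word, s, word_df, pair_df, n) for s in neg_seeds if s != word)
    return pos - neg

# Hypothetical, already tokenized reviews (Chinese text would need word segmentation first).
reviews = [
    ["screen", "great", "good"],
    ["battery", "great", "good"],
    ["battery", "awful", "bad"],
    ["screen", "great"],
    ["service", "awful", "bad"],
]
stats = count_cooccurrences(reviews)
great = polarity_strength("great", ["good"], ["bad"], stats)  # positive value
awful = polarity_strength("awful", ["good"], ["bad"], stats)  # negative value
print(great, awful)
```

The sign of the score gives the polarity and its magnitude gives a rough strength, which is the kind of quantity a sentence-level sentiment score could be built from.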
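
The PageRank step in contribution (2) can be sketched in simplified form. The code below covers only a sentence-to-sentence layer: it turns Euclidean distances between sentence feature vectors into similarity weights and runs a power-iteration PageRank over the resulting graph. The spectral clustering step and the sentence-to-class layer are omitted, and the feature vectors, the `1 / (1 + distance)` similarity, the damping factor, and the function names are all assumptions rather than the thesis's actual model.

```python
import math

def euclidean_similarity(u, v):
    """Map Euclidean distance to a similarity in (0, 1]; identical vectors score 1."""
    return 1.0 / (1.0 + math.dist(u, v))

def build_weights(vectors):
    """Symmetric similarity matrix with a zero diagonal (no self-links)."""
    n = len(vectors)
    return [
        [0.0 if i == j else euclidean_similarity(vectors[i], vectors[j]) for j in range(n)]
        for i in range(n)
    ]

def pagerank(weights, damping=0.85, tol=1e-9, max_iter=200):
    """Weighted PageRank by power iteration; assumes every node has an outgoing link."""
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new_scores = []
        for j in range(n):
            incoming = sum(
                scores[i] * weights[i][j] / out_sum[i]
                for i in range(n)
                if out_sum[i] > 0
            )
            new_scores.append((1.0 - damping) / n + damping * incoming)
        converged = max(abs(a - b) for a, b in zip(new_scores, scores)) < tol
        scores = new_scores
        if converged:
            break
    return scores

# Hypothetical sentiment-element feature vectors, one per sentence.
vectors = [(1.0, 0.0), (0.9, 0.1), (1.0, 0.1), (-1.0, 0.5)]
scores = pagerank(build_weights(vectors))
print(scores)
```

In this toy input the first three sentences are close to each other and reinforce one another, so the isolated fourth sentence receives the lowest score, which matches the intuition that mainstream sentiment is what many similar sentences agree on.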
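
The sentence selection in contribution (3) can be sketched with the standard greedy form of Maximal Marginal Relevance: at each step, pick the candidate that maximizes `lam * score - (1 - lam) * max_similarity_to_selected`. In the sketch below the per-sentence sentiment scores are hypothetical inputs, and the Jaccard token overlap used as the similarity, the value of `lam`, and the function names are assumptions. The abstract does not state which similarity or weighting the thesis uses.

```python
def jaccard(a, b):
    """Token-set overlap in [0, 1]."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def mmr_select(sentences, scores, k, lam=0.5):
    """Greedily pick up to k sentence indices, trading score against redundancy."""
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr(i):
            # Redundancy is the highest similarity to any sentence already chosen.
            redundancy = max(
                (jaccard(sentences[i], sentences[j]) for j in selected),
                default=0.0,
            )
            return lam * scores[i] - (1.0 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical tokenized sentences with hypothetical sentiment scores.
sentences = [
    ["battery", "lasts", "long"],
    ["battery", "lasts", "very", "long"],
    ["screen", "is", "dim"],
]
scores = [0.9, 0.85, 0.5]
picked = mmr_select(sentences, scores, k=2, lam=0.5)
print(picked)
```

With `lam=0.5` the near-duplicate second sentence is skipped in favor of the lower-scored but novel third sentence, while `lam=1.0` reduces to ranking by score alone.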
Keywords/Search Tags:Opinion Summarization, Sentiment Information, Polarity Strength, Diversity, Redundancy