Studies On Clustering Analysis And Visualization For Public Opinions Of Discrete Text About Certain Topic

Posted on:2012-01-06

Degree:Master

Type:Thesis

Country:China

Candidate:Y Shen

Full Text:PDF

GTID:2178330335952713

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of Web 2.0, more and more netizens are accustomed to publishing opinions on online platforms such as BBS forums and blogs. These discrete texts, stored in scattered locations and expressing different views, together constitute an all-encompassing body of public web opinion. Qualitative and quantitative analysis of sentiment polarity in discrete texts is an important way to understand online public opinion and netizens' attitudes towards things or events. On that basis, clustering analysis of time-varying public web opinion, together with visualization of the results, can represent the tendency of public opinion vividly. This is a hot issue of common concern in many fields.

In summary, the thesis accomplishes the goal of public opinion analysis by means of clustering analysis, with sentiment polarity as the clue and opinion mining as the strategy.

Research on opinion mining of Chinese texts started late, and much fundamental work is still in progress. Research on public opinion analysis of online discrete text is still at an initial stage. The thesis focuses on the characteristics of discrete text for clustering analysis of public opinion.

The thesis studies the titles and snippets of blog texts. Blog texts carry rich sentiment whose polarity is scattered throughout the text, so it is difficult to obtain their key semantics or central concepts. Titles and snippets, however, contain relatively few sentiment words and express concentrated concepts. Selecting the titles and snippets of blog texts as the final research object is therefore an important measure for accelerating clustering convergence.

The experiment in the thesis consists of clustering analysis of blog-text public opinion and evaluation of the clustering results. The clustering analysis comprises two parts: a clustering analysis model based on the concept of public opinion, and visualization of the clustering results.

The thesis improves the traditional vector space model (VSM) by introducing word concepts, and uses the concept-based VSM to represent blog texts (titles and snippets) in order to improve the precision of text representation. Blog texts are represented by both the term-based VSM and the concept-based VSM, and each representation is clustered with the k-means algorithm. Finally, the clustering results are visualized and evaluated. The traditional VSM serves as the comparison group for evaluating the performance of concept-based clustering analysis of public opinion. The clustering results are evaluated against ground truth with three common metrics: Precision, Entropy and Rand Index.

The experiments show that the concept-based VSM performs better than the traditional term-based VSM in public opinion clustering of discrete texts.
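The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several things in it are assumptions made for illustration: the word-to-concept lexicon `CONCEPT_OF`, the four toy titles in `DOCS`, the deterministic centroid initialisation, and the reading of "Precision" as cluster purity (the abstract does not define the metric). The toy data is deliberately chosen so that the effect of merging synonyms into one concept dimension is visible; it does not reproduce the thesis's results.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical word-to-concept lexicon; the thesis's actual concept resource is not given.
CONCEPT_OF = {"great": "GOOD", "excellent": "GOOD", "awful": "BAD", "terrible": "BAD"}

def term_vector(tokens):
    """Term-based VSM: one dimension per surface word."""
    return Counter(tokens)

def concept_vector(tokens):
    """Concept-based VSM: words that share a concept share one dimension."""
    return Counter(CONCEPT_OF.get(t, t) for t in tokens)

def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in set(a) | set(b))

def kmeans(vectors, k, iters=20):
    """Plain k-means; centroids start at the first k vectors so runs are repeatable."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = []
    for _ in range(iters):
        # Assignment step: nearest centroid (ties go to the lower index).
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                total = Counter()
                for m in members:
                    total.update(m)
                centroids[j] = {t: c / len(members) for t, c in total.items()}
    return labels

def purity(labels, truth):
    """Share of items that belong to the majority class of their cluster."""
    hits = 0
    for c in set(labels):
        classes = Counter(t for lab, t in zip(labels, truth) if lab == c)
        hits += max(classes.values())
    return hits / len(labels)

def entropy(labels, truth):
    """Size-weighted class entropy per cluster; 0 means every cluster is pure."""
    total = 0.0
    for c in set(labels):
        classes = Counter(t for lab, t in zip(labels, truth) if lab == c)
        n = sum(classes.values())
        h = -sum((v / n) * math.log2(v / n) for v in classes.values())
        total += (n / len(labels)) * h
    return total

def rand_index(labels, truth):
    """Fraction of item pairs on which the clustering and the ground truth agree."""
    pairs = list(combinations(range(len(labels)), 2))
    agree = sum((labels[i] == labels[j]) == (truth[i] == truth[j]) for i, j in pairs)
    return agree / len(pairs)

# Toy tokenised titles with ground-truth polarity (illustrative data only).
DOCS = [["great", "phone"], ["awful", "phone"], ["excellent", "phone"], ["terrible", "phone"]]
TRUTH = ["pos", "neg", "pos", "neg"]

for name, vectorize in [("term", term_vector), ("concept", concept_vector)]:
    labels = kmeans([vectorize(d) for d in DOCS], k=2)
    print(name, labels, purity(labels, TRUTH), entropy(labels, TRUTH), rand_index(labels, TRUTH))
```

In the term-based representation every sentiment word is its own dimension, so all four titles are equally far apart and k-means has nothing to separate them by. Once "great" and "excellent" map to one concept and "awful" and "terrible" to another, titles with the same polarity become identical vectors and fall into the same cluster. A real experiment would use a proper concept lexicon, weighted features such as TF-IDF, and repeated random initialisations.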

Keywords/Search Tags:

discrete text, cluster, public opinion analysis, sentiment polarity, clustering visualization

Related items

1	Studies On Clustering Blog Text Based On Certain Topic And Sentiment Polarity
2	Research On Sentiment Orientation Clustering For Chinese Text Comment
3	Research And Implementation Of Key Technologies In Sentiment Analysis Of Weibo Topics
4	Research On The Techniques Of Online Public Opinion Monitoring Based On Topics
5	Based On The Text Orientation Of Shallow Semantic Analysis
6	Public Opinion Extraction Method Based On Text Emotional Computing Research
7	Study Of The American National Image Construction In China's Public Opinion Filed Based On Sentiment Analysis
8	Research On Text Clustering Of Micro-blog Public Opinion: Word Sense Cluster And Collocation-Based Method
9	Research On The Sentiment Analysis Method Of Online Public Opinion Based On Picture Fuzzy Set
10	Application Of Text Sentiment Analysis In The Agricultural Network Public Opinion System