
Research On Key Techniques Of Chinese Micro-blog Sentiment Analysis

Posted on: 2014-06-12 | Degree: Master | Type: Thesis
Country: China | Candidate: J H Lin | Full Text: PDF
GTID: 2255330422455882 | Subject: Business management
Abstract/Summary:
Since its introduction, Micro-blog has become increasingly popular. Research on Micro-blog sentiment analysis benefits the management of government and enterprises. Micro-blog text places higher demands on natural language processing, but related studies are still at an initial stage and many issues remain to be explored. This research therefore has both theoretical and practical value.

This thesis introduces the key techniques of Chinese Micro-blog sentiment analysis, in particular sentiment lexicon construction, sentiment feature selection, and classifiers.

Three sentiment lexicons were constructed: a basic lexicon, an emoticon lexicon, and a cyber-speak lexicon. Based on the different features of these lexicons, the study proposes several methods for constructing sentiment lexicons and applies them to sentiment analysis. The experiments show that the micro-average of (SO-A) reaches 78.61% and that of (SO-P) 70.76%. In the mixed-corpus setting, the micro-average of (SO-A) reaches 79.88% and that of (SO-P) 71.75%. This indicates that the constructed sentiment lexicons are valid for selecting sentiment words, judging sentiment polarity, and computing weights. In addition, the sentiment lexicons offer good classification performance, a simplified process, and stable behavior.

The study also examines sentiment analysis with Naïve Bayes (NB). Given the short-text nature of Micro-blog posts, the experiments compare single opinion with separated opinions; study CHI (chi-square), sentiment lexicon, and twice sentiment feature extraction; and use TF, BOOL, and TF-IDF to compute weights. The findings suggest that with a single opinion the micro-average (F1) of NB reaches 75.69%, while with separated opinions F1 = 78.63%. This indicates that sentiment classification works better when opinions are separated. The optimal pre-processing is "separated opinions + twice sentiment feature extraction + BOOL", whose micro-average reaches 78.63%.

Moreover, using a corpus that mixes Micro-blog posts and product comments, the thesis also explores sentiment analysis of massive web texts. The results show that the sentiment lexicon classifies better (F1 = 79.88%) than NB (F1 = 67.8%), and that the lexicon approach is simple, fast, and stable.
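
To make the three weighting schemes named in the abstract (TF, BOOL, TF-IDF) concrete, here is a minimal Python sketch. It is not the thesis's implementation: the function name `term_weights`, the toy posts, and the specific TF-IDF variant (raw count times log(N / df)) are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_weights(docs, scheme="bool"):
    """Return one {term: weight} dict per tokenized document.

    scheme: "bool" (term presence), "tf" (raw count), or
    "tfidf" (count * log(N / df)). The TF-IDF variant is an
    assumption; the thesis may use a different formula.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        counts = Counter(doc)
        if scheme == "bool":
            weights = {t: 1.0 for t in counts}
        elif scheme == "tf":
            weights = {t: float(c) for t, c in counts.items()}
        elif scheme == "tfidf":
            weights = {t: c * math.log(n_docs / df[t]) for t, c in counts.items()}
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        weighted.append(weights)
    return weighted


# Toy, pre-tokenized posts (hypothetical data, not from the thesis corpus).
posts = [
    ["good", "good", "phone"],
    ["bad", "phone"],
]

bool_w = term_weights(posts, "bool")
tf_w = term_weights(posts, "tf")
tfidf_w = term_weights(posts, "tfidf")
```

Under this variant, a term that appears in every document (here "phone") gets a TF-IDF weight of zero, while BOOL ignores repetition entirely. In very short posts most terms occur only once, so BOOL and TF weights will often coincide.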
Keywords/Search Tags: Chinese Micro-blog, Sentiment analysis, Sentiment lexicon, Naïve Bayes