Sentiment Analysis And Application Of Chinese Microblog Based On Machine Learning

Posted on: 2018-10-11
Degree: Master
Type: Thesis
Country: China
Candidate: J D Zhang
GTID: 2348330518495303
Subject: Information and Communication Engineering
Abstract/Summary:
With the development and spread of the mobile internet and social networking services in the 21st century, microblogs have become part of daily life for many internet users. The text messages they post carry enormous commercial and social value. Sentiment analysis, which lets people acquire and mine large volumes of such sentiment-bearing messages, has therefore emerged and rapidly become a research hotspot in natural language processing at home and abroad. This thesis takes Sina Microblog as its research object.

First, building on a fusion of traditional microblog features and addressing the shortcomings of the existing basic features, the thesis proposes a new feature scheme for sentiment classification. To address the lack of complete semantic information at the sentiment-word level, it proposes a feature extraction method based on semantic rules and a sentiment dictionary. Because existing features ignore semantic and structural analysis at the sentence level, it defines embedded-sentence features at sentence-level granularity by analyzing the number and distribution of sentences in a microblog. To address the lack of semantic information in complex contexts, it extracts features of the semantic relations between sentences in complex sentence patterns and, combining them with the embedded-sentence features and the Language Technology Platform (LTP) developed by Harbin Institute of Technology, analyzes complex emotional changes. Finally, semantic features of complex sentences are defined to improve the classification feature scheme.

Second, the thesis examines the importance of microblog emoticons, the limitations of the default emoticon library in feature extraction, and the inefficiency of constructing emoticon resources by hand. Based on the contexts in which emoticons co-occur, it uses a statistical method to build a mapping table and designs an emoticon lexicon self-creation algorithm that combines semantic sentiment analysis of the text with the default emoticon lexicon. An emoticon lexicon self-creation experiment on large-scale microblog data, comprising a symbolic noise filtering experiment and a threshold selection experiment, verifies the feasibility and effectiveness of the proposed algorithm.

Finally, combining the Support Vector Machine (SVM), the improved sentiment classification features, and the emoticon lexicon self-creation algorithm, the thesis proposes an SVM-based microblog sentiment classification model. Using five-fold cross-validation, it first compares and analyzes the results of classification based on semantic rules and a sentiment knowledge dictionary, SVM classification with the basic microblog features, and classification based on the semantic features of complex sentences. The results verify the effectiveness of the proposed feature scheme. On this basis, the default emoticon lexicon and the emoticon lexicons created from 15/75/500 million microblogs are applied to extract emoticon features, and the SVM-based classification experiments are again compared and analyzed. The results show that emoticon lexicon self-creation improves microblog sentiment classification to some extent. In this way, the SVM-based microblog sentiment classification model proposed in the thesis is optimized.
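
The abstract does not give the details of the emoticon lexicon self-creation algorithm, so the following is only a minimal sketch of the general idea it describes: counting how often each emoticon co-occurs with positive or negative text, filtering rare symbols as noise, and accepting an entry when one polarity passes a threshold. The toy sentiment dictionary, the function names, the bracketed emoticon tokens, and the parameters `min_count` and `threshold` are all assumptions made for illustration, not the thesis's actual resources or settings.

```python
from collections import defaultdict

# Hypothetical toy sentiment dictionary; the thesis's dictionary is not
# given in the abstract.
SENTIMENT_WORDS = {"happy": 1, "great": 1, "love": 1,
                   "sad": -1, "awful": -1, "hate": -1}


def text_polarity(words):
    """Return +1, -1, or 0 from the sign of the summed dictionary scores."""
    score = sum(SENTIMENT_WORDS.get(w, 0) for w in words)
    return (score > 0) - (score < 0)


def build_emoticon_lexicon(posts, seed_lexicon, min_count=2, threshold=0.7):
    """Assign a polarity to emoticons from co-occurrence with polar text.

    posts: iterable of (words, emoticons) pairs.
    seed_lexicon: default emoticon -> polarity entries, kept unchanged
        (an assumption about how the default lexicon is used).
    min_count: emoticons seen in fewer polar posts are treated as noise.
    threshold: minimum share of the dominant polarity to accept an entry.
    """
    counts = defaultdict(lambda: [0, 0])  # emoticon -> [positive, negative]
    for words, emoticons in posts:
        polarity = text_polarity(words)
        if polarity == 0:
            continue  # neutral text gives no evidence either way
        for emo in set(emoticons):
            counts[emo][0 if polarity > 0 else 1] += 1

    lexicon = dict(seed_lexicon)
    for emo, (pos, neg) in counts.items():
        total = pos + neg
        if emo in lexicon or total < min_count:
            continue  # keep seed entries; drop low-frequency symbols
        if pos / total >= threshold:
            lexicon[emo] = 1
        elif neg / total >= threshold:
            lexicon[emo] = -1
    return lexicon


posts = [
    (["great", "day"], ["[sun]"]),
    (["love", "it"], ["[sun]", "[haha]"]),
    (["awful", "traffic"], ["[rain]"]),
    (["sad", "news"], ["[rain]"]),
    (["happy"], ["[odd]"]),
    (["hate", "mondays"], ["[haha]"]),
]
lexicon = build_emoticon_lexicon(posts, seed_lexicon={"[haha]": 1})
print(lexicon)
```

In this toy run, `[sun]` and `[rain]` each appear twice with consistently polar text and are added to the lexicon, `[odd]` is seen only once and is filtered out, and the seed entry `[haha]` is left unchanged. A real system would tune the frequency filter and the threshold experimentally, as the noise filtering and threshold selection experiments in the abstract suggest.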
Keywords/Search Tags: sentiment analysis, support vector machine (SVM), complex sentence pattern, emoticon lexicon self-creation