| With the rapid development of the Internet and the arrival of the Web 2.0 era, social network platforms such as Weibo and WeChat have become increasingly important in people's lives. The number of microblog posts on the Internet is growing explosively, and this mass of information carries many of its users' emotions and viewpoints, which can provide data support for government agencies' public opinion analysis, enterprises' market decision-making, and consumer behavior analysis. Conducting sentiment analysis of microblog texts to uncover their potential value is therefore of great commercial and social significance. The main task of microblog-based sentiment analysis is to judge whether the sentiment tendency of a microblog text is positive or negative. First, this paper builds a sentiment dictionary for microblog sentiment analysis. The basic sentiment lexicon combines two existing Chinese sentiment dictionaries, NTUSD and HowNet; it is then expanded with emoticons, new words, and popular Internet words collated from the Weibo corpus, using an approach that combines PMI and word2vec, to obtain the final sentiment dictionary. Next, microblog text corpus data obtained from the Internet and the evaluation corpus of the Sixth Chinese Opinion Analysis Evaluation (COAE2014) are preprocessed with word segmentation and stop-word removal, and the initial training set is built by combining manually annotated texts with existing corpora annotated for sentiment tendency. The paper then presents a combined text sentiment classification method that addresses two shortcomings: the over-reliance of dictionary-based classification on the sentiment dictionary and its weak handling of unknown words, and the loss of sentiment-relevant elements of the text, such as degree adverbs, relations between sentences, and sentence patterns, when machine-learning-based classification constructs its feature vectors. When building the feature vector space for training, the method retains the information that the traditional support vector machine (SVM) classification method discards and incorporates it into the feature vectors. Because it uses a machine learning algorithm in the classification phase, it also retains, to a certain extent, the ability to handle words that are missing from the sentiment dictionary. Finally, the method is verified on the microblog text corpus, and the results show that it achieves higher accuracy than traditional sentiment analysis methods. |
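
The PMI part of the dictionary expansion can be illustrated with a minimal sketch. It assumes the common SO-PMI formulation (PMI with positive seed words minus PMI with negative seed words, estimated from document-level co-occurrence); the abstract does not specify the exact formula or how PMI is combined with word2vec, so the function name `so_pmi`, the add-one smoothing, and the toy English corpus are all illustrative assumptions rather than the paper's implementation.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(docs, pos_seeds, neg_seeds, smoothing=1.0):
    """Score candidate words by PMI with positive seeds minus PMI with negative seeds.

    docs: list of token lists (already segmented, stop words removed).
    The smoothing constant is an ad hoc assumption that avoids log(0).
    """
    n_docs = len(docs)
    word_df = Counter()  # number of documents containing each word
    pair_df = Counter()  # number of documents containing each word pair
    for tokens in docs:
        unique = sorted(set(tokens))
        word_df.update(unique)
        pair_df.update(frozenset(pair) for pair in combinations(unique, 2))

    def pmi(a, b):
        joint = pair_df[frozenset((a, b))] + smoothing
        marginals = (word_df[a] + smoothing) * (word_df[b] + smoothing)
        return math.log2(joint * n_docs / marginals)

    seeds = set(pos_seeds) | set(neg_seeds)
    scores = {}
    for word in word_df:
        if word in seeds:
            continue  # seed words are already in the dictionary
        scores[word] = (sum(pmi(word, s) for s in pos_seeds)
                        - sum(pmi(word, s) for s in neg_seeds))
    return scores


# Toy corpus (hypothetical data, English tokens for readability).
docs = [
    ["good", "awesome"],
    ["good", "awesome"],
    ["bad", "lousy"],
    ["bad", "lousy"],
]
scores = so_pmi(docs, pos_seeds={"good"}, neg_seeds={"bad"})
positive_candidates = [w for w, s in scores.items() if s > 0]
negative_candidates = [w for w, s in scores.items() if s < 0]
```

Under this sketch, words with a positive score would be candidates for the positive side of the expanded dictionary and words with a negative score for the negative side. A real pipeline would likely apply a score threshold and cross-check candidates against word2vec similarity to the seed words, as the combined approach described above suggests.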