| With the rapid development of the Internet and the emergence of e-commerce and social networking platforms, people’s lifestyles have changed greatly. More and more people share their opinions about trending events on Weibo and other social networking platforms, and more and more people review goods they have bought on e-commerce platforms such as JD.com. The enormous volume of web comment data generated by Internet users has great social and commercial value, because these comments contain users’ views and sentiments. Sentiment analysis of this text data can give governments a basis for formulating relevant policies and help companies understand users’ attitudes towards a product or service. At present, sentiment analysis technology for English text is relatively mature, but there is still much room for research on sentiment analysis of Chinese text. This thesis focuses on sentiment analysis of Chinese web comments. The earliest sentiment classification studies relied on sentiment lexicons; more recent work uses machine learning. This thesis improves on both methods and then applies them to sentiment classification of web comments: (1) Lexicon-based sentiment classification judges the sentiment of a text mainly through the sentiment words it contains. The core of this method is the construction of the sentiment lexicon, whose completeness directly affects classification accuracy. The traditional method expands the lexicon with the SO-PMI algorithm, but it misjudges words when the data is sparse. This thesis instead expands the sentiment lexicon by combining the SO-PMI algorithm with Word2vec; this takes the semantic information of the vocabulary into account and yields better sentiment classification. (2) In machine-learning-based sentiment classification of web comments, the corpus is usually divided into a training set and a test set, followed by text preprocessing, feature selection, feature representation, and classifier training; the trained classifier then performs sentiment analysis of the text. At present, the mainstream feature representation is Word2vec, which does not consider word order, whereas fastText incorporates N-gram features and can effectively address the word-order problem. Therefore, this thesis proposes combining the fastText algorithm with a classifier for sentiment classification, which has advantages over the traditional text sentiment classification method. (3) The lexicon-based method achieves higher accuracy on texts with obvious emotional tendencies but performs poorly on texts with ambiguous sentiment. To address this problem, this thesis proposes combining the sentiment lexicon with machine learning for text sentiment classification; the resulting classification performance is better than that of the traditional method. (4) Building on the theoretical analysis of text sentiment classification, this thesis studies a related visualization application: a sentiment analysis system is designed, implemented, and tested. |
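
As an illustration of the SO-PMI step named in point (1), the following is a minimal sketch and not the implementation used in this thesis. It scores a candidate word by its pointwise mutual information with positive seed words minus its PMI with negative seed words. The function name `so_pmi`, the document-level co-occurrence window, the additive `smoothing` term, and the English toy corpus are all assumptions made for the example; Chinese text would first need word segmentation, and the Word2vec combination described above is not shown.

```python
import math
from collections import Counter


def so_pmi(candidate, docs, pos_seeds, neg_seeds, smoothing=1.0):
    """Sum of PMI(candidate, positive seeds) minus sum of PMI(candidate, negative seeds).

    Co-occurrence is counted per document (an assumption of this sketch).
    Additive smoothing keeps the logarithm defined when counts are zero.
    """
    n_docs = len(docs)
    doc_sets = [set(tokens) for tokens in docs]
    # Document frequency of every word in the corpus.
    doc_freq = Counter(word for s in doc_sets for word in s)

    def pmi(w1, w2):
        # Number of documents that contain both words.
        co = sum(1 for s in doc_sets if w1 in s and w2 in s)
        # PMI = log2( p(w1, w2) / (p(w1) * p(w2)) ), with smoothed counts.
        numerator = (co + smoothing) * n_docs
        denominator = (doc_freq[w1] + smoothing) * (doc_freq[w2] + smoothing)
        return math.log2(numerator / denominator)

    positive = sum(pmi(candidate, seed) for seed in pos_seeds)
    negative = sum(pmi(candidate, seed) for seed in neg_seeds)
    return positive - negative


# Hypothetical, already-tokenized toy corpus.
docs = [
    ["screen", "great", "good"],
    ["battery", "good", "great"],
    ["delivery", "slow", "bad"],
    ["package", "bad", "slow"],
]
score_great = so_pmi("great", docs, pos_seeds=["good"], neg_seeds=["bad"])
score_slow = so_pmi("slow", docs, pos_seeds=["good"], neg_seeds=["bad"])
```

A positive score suggests adding the candidate to the positive lexicon and a negative score to the negative lexicon. With so few documents the counts are dominated by the smoothing term, which is the sparse-data weakness that motivates bringing in Word2vec similarity.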