| With the rapid development of information technology, the Internet has entered the Web 2.0 era. People's communication is shifting from traditional interactive devices to fast, convenient social media such as Weibo and WeChat. WeChat users are connected to their real-life acquaintances, which makes them cautious when posting Moments. Weibo posts, by contrast, reflect people's feelings and opinions more faithfully, because users face fewer constraints in a virtual network. As the most popular app among college students, Weibo offers an effective way to study their campus life, academic status, and mentality. Mining the abundant sentiment information in Weibo text is therefore a challenging but worthwhile research problem. Domestic studies of Weibo focus on sentiment analysis, but most of them restrict the research to a predetermined topic. This thesis concentrates on college students' points of interest and mental health problems, combining statistical theory and machine learning to analyze the sentiment features of students' Weibo posts and to build a monitoring system for abnormal posts. The methods presented can be applied to sentiment research on college students' microblogs. First, we use a web crawler to randomly collect students' posts from a college's official Weibo account. Second, we extract keywords from the Weibo text and find that the words used by undergraduates and postgraduates differ little, whereas the vocabulary of doctoral students differs considerably from both groups. Third, we improve the word segmentation and build word vectors with the help of principal component analysis in order to score unrecognized compound words that cannot be found in the emotion dictionary. Fourth, we build a scorecard model based on logistic regression. The AUC on the test set reaches 0.86, which indicates that the sentiment score discriminates the emotion of microblogs well. Finally, we explore how students' emotions differ when they post on Weibo about different topics. The sentiment of a post is higher when its content relates to entertainment and food, and lower when it relates to scientific research and study. We also construct a monitoring system for abnormal posts, through which some students with serious psychological problems were identified. |
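
The scorecard step can be illustrated with a minimal sketch. It assumes a logistic regression that outputs the probability that a post is positive, and it maps that probability to a score with the common "points to double the odds" scaling. The abstract does not state the thesis's actual features, coefficients, or scaling, so the feature names, weights, `base_score`, and `pdo` values below are hypothetical. The `auc` helper shows how a figure such as the reported test-set AUC is computed in principle, here on toy data only.

```python
import math


def sigmoid(z):
    """Logistic function used by logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability that a post is positive under a fitted logistic model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def probability_to_score(p, base_score=600.0, base_odds=1.0, pdo=20.0):
    """Map a probability to a scorecard score.

    Assumed scaling (not taken from the thesis): the score equals
    base_score at odds base_odds and rises by pdo points each time
    the odds of a positive post double.
    """
    p = min(max(p, 1e-9), 1.0 - 1e-9)  # avoid log(0) at the extremes
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return offset + factor * math.log(p / (1.0 - p))


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical model: two features per post, the counts of positive
# and negative dictionary words. The weights are illustrative only.
weights = [0.8, -1.1]
bias = 0.1

toy_posts = [[3, 0], [1, 0], [1, 2], [0, 3]]
toy_labels = [1, 1, 0, 0]  # 1 = positive post, 0 = negative post

toy_scores = [
    probability_to_score(predict_proba(weights, bias, x)) for x in toy_posts
]
toy_auc = auc(toy_labels, toy_scores)
```

Because the score is a monotonic function of the predicted probability, the AUC is the same whether it is computed on probabilities or on scorecard points. Under these assumptions a low score flags a post for closer review, which is one plausible way such a score could feed an abnormal-post monitor.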