Research and Implementation of Weibo Short Text Classification

Posted on: 2017-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: X W Zhang
Full Text: PDF
GTID: 2348330518495972
Subject: Computer Science and Technology
Abstract/Summary:
The rapid development of information technology and the Internet has made social networks increasingly popular, and Sina Weibo is the most famous of these networks. As the Weibo user base expands, a large number of microblogs are produced at every moment, and users cannot find useful information efficiently in such a volume of posts. Classifying microblogs is one way to address this problem. Compared with normal texts, however, microblogs are short, carry few signals and have sparse features, so classifying them is not an easy task. To overcome these drawbacks, this thesis proposes a feature extension method based on the Bigram model. The method expands the features of microblogs and thereby increases the accuracy of the classification algorithm. The work in this thesis consists of the following two parts:

1. A feature extension method based on the Bigram model is proposed. A word transfer model is constructed from bigrams, and the words in an original microblog are used to obtain candidate feature words that can be added to the original text. Because the extension step may also add noise features, a noise removal method based on word similarity is proposed; it eliminates noise features and improves the quality of the extension. A detailed experiment plan is designed and carried out on two datasets. The results show that the proposed method enriches the features of microblogs and that classification accuracy improves substantially.

2. Based on users' data on the Sina Weibo platform, the Weibo Data Analysis System is designed and implemented. The system retrieves the latest data of Weibo users and analyzes it at multiple granularities and from multiple perspectives. It implements functions such as social network analysis (SNA), analysis of users' fields of interest, keyword extraction, and statistics and analysis of population information. The system thus comprehensively analyzes the online social behavior characteristics of Weibo users.
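
The feature extension and noise removal steps described in part 1 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not specify how the word transfer model is scored or which word similarity measure is used, so the function names, the `top_k` and `min_sim` parameters, and the Jaccard co-occurrence similarity below are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def build_transfer_model(corpus):
    """Count bigram transitions: model[w1][w2] = times w2 directly follows w1."""
    model = defaultdict(Counter)
    for tokens in corpus:
        for w1, w2 in zip(tokens, tokens[1:]):
            model[w1][w2] += 1
    return model


def build_doc_index(corpus):
    """Map each word to the set of document ids in which it occurs."""
    index = defaultdict(set)
    for doc_id, tokens in enumerate(corpus):
        for word in tokens:
            index[word].add(doc_id)
    return index


def jaccard(index, a, b):
    """Assumed similarity measure: Jaccard overlap of the words' document sets."""
    docs_a, docs_b = index.get(a, set()), index.get(b, set())
    union = docs_a | docs_b
    return len(docs_a & docs_b) / len(union) if union else 0.0


def extend_features(tokens, model, index, top_k=2, min_sim=0.5):
    """Add likely successor words, dropping candidates with low similarity."""
    extended = list(tokens)
    seen = set(tokens)
    for word in tokens:
        # Candidate features: the top_k most frequent successors of `word`.
        for cand, _count in model.get(word, Counter()).most_common(top_k):
            if cand in seen:
                continue
            # Noise removal: keep a candidate only if its average similarity
            # to the original words reaches the (hypothetical) threshold.
            avg_sim = sum(jaccard(index, cand, t) for t in tokens) / len(tokens)
            if avg_sim >= min_sim:
                extended.append(cand)
                seen.add(cand)
    return extended


# Toy corpus of already-tokenized texts (illustrative data only).
corpus = [
    ["stock", "market", "rises"],
    ["stock", "market", "falls"],
    ["stock", "photo", "gallery"],
    ["football", "match", "tonight"],
]
model = build_transfer_model(corpus)
index = build_doc_index(corpus)
print(extend_features(["stock"], model, index))  # ['stock', 'market']
```

In this toy run, "market" and "photo" are both bigram successors of "stock", but "photo" co-occurs with it in only one of three documents and is filtered out as noise. A real system would segment Chinese text into words first and would likely use a stronger similarity measure.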
Keywords/Search Tags:Short Text Classification, Bigram, Feature Extension, Noise Removal