Microblog Popularity Prediction Based On Active Learning

Posted on: 2020-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: M T Xu
Full Text: PDF
GTID: 2428330575962059
Subject: Computer Science and Technology
Abstract/Summary:
The emergence of microblogs has promoted the development of social networks. Microblog platforms have a large number of users who share information and communicate with one another, making microblogs one of the important channels for communicating and sharing information. A large amount of information appears on microblogging platforms every day. Although microblogs make it convenient for people to communicate and share messages, they have also brought many challenges. Popularity prediction for social networks such as microblogs has therefore attracted the attention of scholars. Predicting microblog popularity in a timely and accurate manner is significant for personalized message recommendation, breaking news detection, public opinion analysis and so on.

Firstly, the traditional SVM-based active learning method selects the sample points closest to the separating hyperplane and considers only the uncertainty of the samples, which leads to redundancy and outliers among the selected unlabeled samples. To address these two problems, a novel SVM-based active learning algorithm is proposed. The algorithm takes into account not only the uncertainty of the unlabeled samples but also their diversity and representativeness. The experiments show that the proposed algorithm has advantages in convergence speed, data annotation and the stability of the accuracy curve.

Secondly, earlier studies of the factors that affect microblog popularity prediction ignored some structural features and the time features of the retweet structure within one hour of a microblog's release. The proportion of weak-relationship users, the average depth, the Wiener index, the Randić index and the time features are extracted from the structure graph of the users who retweet within one hour, and their impact on microblog popularity prediction is analyzed. Experiments on a microblog dataset show that the predictive performance of microblog popularity is effectively improved by considering these features together.

Finally, traditional microblog popularity prediction based on machine learning algorithms depends on the amount of labeled data available as a training set. In practical applications, labeled data is often costly to obtain, whereas a large amount of unlabeled data is easy to obtain. To address this problem, a method of microblog popularity prediction based on active learning is proposed. Given a small amount of labeled data and a large amount of unlabeled data, the user features of the publisher, the features of the microblog content, the retweet structure features within one hour, and the time features are combined, and an SVM-based active learning model is trained to predict microblog popularity. The experiments show that the method of microblog popularity prediction based on active learning not only reduces the cost of labeling, but also improves the prediction effect of the model and shows good performance.
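As an illustration of three of the structural features named in the abstract, the sketch below computes the average depth, the Wiener index and the Randić index of a retweet graph using their standard graph-theoretic definitions. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function name `retweet_structure_features`, the `(parent, child)` edge-list input and the choice to treat the graph as undirected and connected are all assumptions made here. The weak-relationship proportion and the time features are omitted because the abstract does not define them.

```python
import math
from collections import deque


def retweet_structure_features(edges, root):
    """Structural features of a retweet graph (illustrative sketch).

    edges: list of (parent, child) pairs, meaning `child` retweeted from
           `parent` within the observation window (assumed input format).
    root:  the user who published the original microblog.
    Assumes the graph is connected and has no self-loops.
    """
    # Build an undirected adjacency map.
    adj = {root: set()}
    for parent, child in edges:
        adj.setdefault(parent, set()).add(child)
        adj.setdefault(child, set()).add(parent)

    def bfs(src):
        # Shortest-path distance from src to every reachable node.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        return dist

    # Average depth: mean distance from the publisher to each retweeter.
    depths = bfs(root)
    retweeters = [n for n in depths if n != root]
    avg_depth = (
        sum(depths[n] for n in retweeters) / len(retweeters)
        if retweeters else 0.0
    )

    # Wiener index: sum of shortest-path distances over unordered node pairs.
    # Every pair is visited twice (once from each endpoint), hence // 2.
    wiener = sum(sum(bfs(n).values()) for n in adj) // 2

    # Randic index: sum over edges (u, v) of 1 / sqrt(deg(u) * deg(v)).
    # Every edge is visited twice in the adjacency map, hence / 2.
    randic = sum(
        1.0 / math.sqrt(len(adj[u]) * len(adj[v]))
        for u in adj for v in adj[u]
    ) / 2

    return {
        "average_depth": avg_depth,
        "wiener_index": wiener,
        "randic_index": randic,
    }


# Hypothetical example: three users retweet directly from publisher "a".
star_features = retweet_structure_features(
    [("a", "b"), ("a", "c"), ("a", "d")], root="a"
)
```

Running a breadth-first search from every node keeps the sketch dependency-free and is adequate for the small graphs formed within one hour of release; a graph library would be a reasonable substitute for larger cascades.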
Keywords/Search Tags: Microblog, Popularity Prediction, Active Learning, Unlabeled Samples, SVM