| With the spread of the Internet and the development of information technology, the number of microblog users has grown rapidly and the volume of microblog data has increased explosively. On logging in, users usually face so many updates that they have difficulty finding the posts that interest them. Microblog filtering has therefore become an important part of microblog services. Microblog filtering mainly solves two problems: delivering the microblogs and related information that users are interested in, and filtering out junk microblog content such as reactionary, pornographic, violent, and advertising information. Users' interests change over time, and traditional batch learning cannot adapt to updates of the user interest model, whereas machine learning based on online learning can address this problem. The main contents of this paper are as follows. First, I study the overall framework of microblog filtering, which includes microblog feature extraction, feature selection, feature weight computation, and filtering based on machine learning. The paper describes several machine learning algorithms in detail, namely logistic regression, support vector machines, K-nearest neighbors, and Naive Bayes, and analyzes their advantages and disadvantages. Second, I study the microblog filtering technology framework and microblog filtering methods, focusing on filtering based on the online logistic regression model and the online support vector machine, and compare the strengths and weaknesses of the two methods in terms of time complexity and filtering performance. Third, I study a microblog filtering method based on an improved online support vector machine model. The online support vector machine filter outperforms the logistic regression model, but it has the shortcoming of a long running time. 
To control the time spent by the online support vector machine filter, the paper uses three approaches: reducing the size of the training set, reducing the number of times the model is trained, and reducing the number of iterations. Experiments show that filtering performance fluctuates slightly, but the fluctuation is almost negligible compared with the gain in efficiency, and the efficiency advantage becomes more obvious as the amount of data grows. Last, microblog filtering based on feedback learning is studied. Users provide feedback when they browse microblogs, for example by commenting, forwarding, and collecting. From this feedback we can obtain information about users' interests and then classify microblogs. The experimental results show that feedback learning can improve the performance of microblog filtering. |
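
To make the idea of online learning for filtering concrete, the following is a minimal sketch of an online logistic regression filter that is updated one microblog at a time. It is an illustration under stated assumptions, not the implementation used in this work: the class name `OnlineLogisticFilter`, the whitespace tokenizer, the binary bag-of-words features, the learning rate, and the label convention (1 = filter out, 0 = keep) are all hypothetical choices made for the example.

```python
import math


class OnlineLogisticFilter:
    """Illustrative online logistic regression over binary token features."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate  # assumed value, not from the paper
        self.weights = {}                   # one weight per token seen so far
        self.bias = 0.0

    @staticmethod
    def features(text):
        # Hypothetical feature extraction: lowercase whitespace tokens,
        # each token counted once (binary presence).
        return set(text.lower().split())

    def predict_proba(self, text):
        # Probability that the microblog should be filtered out (label 1).
        score = self.bias + sum(self.weights.get(t, 0.0) for t in self.features(text))
        score = max(-30.0, min(30.0, score))  # clip to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, text, label):
        # One stochastic gradient step on the log-loss for a single example.
        error = label - self.predict_proba(text)
        self.bias += self.learning_rate * error
        for token in self.features(text):
            self.weights[token] = self.weights.get(token, 0.0) + self.learning_rate * error


# Toy stream of (text, label) pairs; the data is made up for illustration.
stream = [
    ("cheap pills buy now", 1),
    ("new paper on online learning", 0),
    ("buy followers cheap", 1),
    ("great talk on machine learning", 0),
]

clf = OnlineLogisticFilter()
for _ in range(20):
    for text, label in stream:
        clf.update(text, label)
```

In this sketch, user feedback such as a comment, forward, or collect action could be fed back as `clf.update(text, 0)`, so that the model shifts toward keeping similar microblogs without retraining on the whole data set. How feedback is actually mapped to labels in this work is not specified here, so that mapping is an assumption.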