| In recent years, micro-blogging has attracted a large number of users because of its openness, timeliness, and richness. This has inevitably led to explosive growth in the amount of micro-blog data and makes it hard for users to quickly obtain the information they need. It is therefore necessary to recommend useful information to users quickly and effectively. The most critical issue is how to discover users' personalized interests, which provides a solid foundation for a good personalized recommendation system. This thesis studies the user interest modeling problem. The main research achievements are as follows: (1) The thesis proposes a noise micro-blog filtering method based on a joint classifier, and a micro-blog content expansion algorithm that uses basic user information, comments, and forwarding information. Together they address the problems of noisy data and short text in micro-blogs. (2) The thesis proposes an improved LDA model based on word-pair co-occurrence for topic analysis of short micro-blog text. It overcomes the low correlation between terms under the same topic, which arises from text vector sparsity when traditional LDA is applied to short text. (3) The thesis obtains user interest features from the user's historical micro-blog data on the one hand, and from the micro-blog content of their followers or fans on the other. In addition, it uses time to classify the acquired topic features as long-term, short-term, or past interests, and retains the long-term and short-term interests for interest modeling. (4) The thesis designs comparison experiments to verify the effectiveness of the method. The results show that the word-pair-based LDA model performs well in analyzing short micro-blog text, and that the combined interest analysis can expand the user's interests and recommend richer information to them. |
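
The word-pair co-occurrence idea in point (2) can be illustrated with a minimal sketch. The abstract does not specify how the thesis builds or uses word pairs, so the code below only shows one plausible preprocessing step: it assumes each micro-blog is already tokenized, treats the whole short text as the co-occurrence window, and counts unordered pairs of distinct words across the corpus. The function names `extract_word_pairs` and `count_word_pairs` and the sample data are hypothetical, and the sketch does not implement the improved LDA model itself.

```python
from collections import Counter
from itertools import combinations


def extract_word_pairs(tokens):
    """Return the unordered pairs of distinct words in one short text.

    Assumption: the whole text is a single co-occurrence window.
    """
    unique_words = sorted(set(tokens))  # sort so each pair has one canonical order
    return list(combinations(unique_words, 2))


def count_word_pairs(docs):
    """Count how many texts in the corpus contain each word pair."""
    counts = Counter()
    for tokens in docs:
        counts.update(extract_word_pairs(tokens))
    return counts


# Hypothetical, already-tokenized micro-blog posts.
docs = [
    ["apple", "phone", "release"],
    ["apple", "phone", "price"],
    ["football", "match"],
]
pair_counts = count_word_pairs(docs)
```

Pooling pairs over the whole corpus in this way gives a denser co-occurrence signal than the sparse per-document word vectors that traditional LDA relies on, which is the motivation the abstract gives for a word-pair-based model.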