| With the development of the Internet, more and more users post microblogs to express their concerns and views on public events, to share their feelings, to report news, and to give their opinions. Microblogs have become a valuable source of data. How to acquire information from microblog data effectively and improve the efficiency of information access has long been a hot research topic. This paper analyzed the structural characteristics of microblogs and built a data preprocessing and topic detection and tracking system for microblogs. Data were collected through the Weibo Open Platform. After word segmentation and feature selection, a vector space model (VSM) was constructed from the selected words. In the system, microblog hashtags were used to increase the weight of topic-related feature words, and the forwarding relationship between microblogs was used to improve clustering accuracy. Moreover, information such as the forwarding count and comment count of a microblog, together with information about the user who posted it, was used to extract topic keywords for the clusters.
The main contributions are as follows. (1) Since many microblogs carry hashtags, and a hashtag usually summarizes the whole microblog text, this paper proposed a hashtag-based feature weight calculation method that effectively improves microblog clustering. (2) Since a forwarding microblog and the microblog it forwards are similar in theme and content, a clustering algorithm using forwarding clusters was proposed: a relation matrix was first constructed from the forwarding relationship between microblogs, and the forwarding clusters were then created and used in the clustering analysis. (3) For each microblog cluster, topic keywords were extracted using information such as the forwarding count and comment count of each microblog and information about the user who posted it. (4) Based on the forwarding relationship between microblogs, this paper proposed an adaptive topic tracking algorithm to track the development of microblog events. |
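As a rough illustration of contributions (1) and (2), the sketch below boosts the weight of terms that also appear in a microblog's hashtag and groups microblogs connected by forwarding links. It is not the paper's implementation: the TF-IDF base weight, the multiplicative `boost` factor, the union-find grouping, and all function names are assumptions made for this example, since the abstract does not give the exact formulas.

```python
import math
from collections import Counter, defaultdict


def hashtag_weighted_vectors(docs, boost=2.0):
    """Build sparse term-weight vectors for a list of microblogs.

    docs: list of (tokens, hashtag_tokens) pairs, already word-segmented.
    boost: assumed multiplier for terms that also occur in the hashtag.
    """
    n = len(docs)
    # Document frequency: number of microblogs containing each term.
    df = Counter()
    for tokens, _ in docs:
        df.update(set(tokens))

    vectors = []
    for tokens, tags in docs:
        tf = Counter(tokens)
        tag_set = set(tags)
        vec = {}
        for term, count in tf.items():
            # Smoothed IDF (an assumption; any TF-IDF variant would do).
            idf = math.log((n + 1) / (df[term] + 1)) + 1.0
            weight = count * idf
            if term in tag_set:
                weight *= boost  # hashtag terms count more
            vec[term] = weight
        vectors.append(vec)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def forward_clusters(n_posts, forwards):
    """Group microblogs linked by forwarding into clusters (union-find).

    forwards: list of (forwarding_index, forwarded_index) pairs.
    Returns a sorted list of sorted index lists.
    """
    parent = list(range(n_posts))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in forwards:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    groups = defaultdict(list)
    for i in range(n_posts):
        groups[find(i)].append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    docs = [
        (["flood", "rescue", "city"], ["flood"]),
        (["flood", "rain"], []),
        (["music", "concert"], ["concert"]),
    ]
    vectors = hashtag_weighted_vectors(docs)
    print(cosine(vectors[0], vectors[1]))
    print(forward_clusters(4, [(1, 0), (2, 1)]))
```

In a fuller pipeline, the forwarding clusters could serve as initial groups whose merged vectors are then compared by cosine similarity, which is one plausible reading of how the forwarding clusters feed the clustering analysis.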