| Microblogs have become a platform where people share and disseminate information immediately: anyone can publish and share information anytime and anywhere. In the era of information explosion, a hot issue has arisen: how to infer a user's location from scattered and diverse tweets. To improve location inference accuracy at different geographic granularities and to address the sparsity of position information, this paper studies location inference techniques for microblog platforms, building on the acquired location-related information and on existing domestic and foreign location inference technology. First, to achieve location inference at the district level and street level, a location inference method based on a language model is presented. By analyzing the features of geographic information on the microblog platform, the local word method is modified to obtain a location-based language model. Experiments were carried out at the district level and the street level respectively. The experimental results show that the F-measure is 0.32 and 0.34 under the unigram and bigram language models, respectively. The recall at the district level and street level is 24.9% and 16.36%, respectively. These results also indicate that the accuracy and recall of location inference need to be improved, especially with respect to the sparsity of location information. Second, to address the location inaccuracy caused by sparse position information, a user location inference method based on microblog content is presented. In the first step, users' tweets are analyzed to extract geographical words as local words for different geographic regions, and the weights of the local words are calculated. Then the tweets of the user under test are segmented and matched against the local words obtained in the previous step to infer the user's location. 
The experimental results show that the accuracy at the province level and city level is 68.49% and 66.52%, respectively, which is better than the Bayes algorithm, the gazetteer algorithm and the TEDAS algorithm. Finally, to further improve the accuracy of location inference, a user location inference method based on tweets and bilateral follow friends is presented. This method combines the tweet-based location inference method with the location inference method based on bilateral follow friends, which can solve the sparsity problem. The experimental results show that this method outperforms the tweet-based algorithm, the bilateral-follow-friends-based algorithm, the gazetteer algorithm and the TEDAS algorithm. When position information is sparse, its accuracy is 81.39% at the province level and 78.85% at the city level. |
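
The content-based step summarized above (extract local words per region, weight them, then match a test user's tweets against them) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the weighting formula, so the weight used here (the share of a word's occurrences that fall in a region) is an assumption, as are the function names `local_word_weights` and `infer_location` and the whitespace tokenization, which stands in for a proper word segmenter.

```python
from collections import Counter, defaultdict


def local_word_weights(tweets_by_region):
    """Map each word to {region: weight}.

    Assumed weighting (not specified in the abstract): the fraction of a
    word's total occurrences that appear in tweets from that region.
    """
    # Count word occurrences per region; whitespace split is a stand-in
    # for the word segmentation step.
    region_counts = {
        region: Counter(word for tweet in tweets for word in tweet.split())
        for region, tweets in tweets_by_region.items()
    }
    # Total occurrences of each word across all regions.
    totals = Counter()
    for counts in region_counts.values():
        totals.update(counts)
    weights = defaultdict(dict)
    for region, counts in region_counts.items():
        for word, n in counts.items():
            weights[word][region] = n / totals[word]
    return weights


def infer_location(user_tweets, weights):
    """Sum the weights of matched local words and return the top region.

    Returns None when no word in the tweets matches a known local word.
    """
    scores = Counter()
    for tweet in user_tweets:
        for word in tweet.split():
            for region, weight in weights.get(word, {}).items():
                scores[region] += weight
    return scores.most_common(1)[0][0] if scores else None
```

Under these assumptions, a word that appears in only one region gets weight 1.0 for it, while a word spread evenly over several regions contributes little to any of them, which is the intuition behind treating region-specific words as local words.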