Research on News Event Collection and Analysis Technology Based on Location

Posted on: 2016-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Li
Full Text: PDF
GTID: 2308330503950770
Subject: Software engineering
Abstract/Summary:
This thesis studies the design and implementation of a geographic information collection and analysis platform focused on news. The system can play a pivotal role in browsing and analyzing news based on the locations of news events. Implementing the system involved three lines of research.

First, the quality of online news is uneven, so this thesis develops and implements an algorithm that recognizes low-value news based on keyword distribution. The algorithm derives a correlation probability from the distribution of the title keywords in the news body, then uses that probability to decide whether an article is low-value news. Experiments show that the algorithm's recognition rate reaches 85.71%, higher than the 72% reached by the topic-sentence similarity method. The algorithm greatly reduces the probability of collecting low-value news, which keeps the collected news data highly pure.

Second, existing techniques for extracting the locations of news events from the news body do not reach high accuracy. This thesis develops and implements an algorithm based on a relational tree of elements. The algorithm first generates the relational tree of elements from the news title, then analyzes the relevance of the sentences that contain the candidate locations. Experimental results show that the algorithm's extraction accuracy reaches 87.25% and its recall reaches 97.8%; both indicators are higher than those of the algorithm based on similarity calculation. The new algorithm can process large amounts of news data efficiently and accurately.

Third, with these two key problems solved, this thesis describes how the whole system is designed and implemented, including database design, natural language processing, and Ajax asynchronous communication. The system contains 12 modules in total, divided into three subsystems: the collection subsystem, the analysis subsystem, and the service module, implemented in Python, JavaScript, and C++ respectively.

Finally, comprehensive testing shows that the system's functionality is fairly complete and that its performance meets the needs of collection and analysis. In addition, four functional indicators are better than those of existing similar systems, so the research goal is achieved.
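
The keyword-distribution filter described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's algorithm, whose details the abstract does not give. The scoring formula (keyword coverage multiplied by the share of body sentences that contain a title keyword), the stopword list, the 0.3 threshold, and all function names are assumptions chosen for illustration. The whitespace-style tokenization suits English text only.

```python
import re

# Hypothetical stopword list; the abstract does not specify one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "at", "to", "and", "for", "is", "are"}


def tokenize(text):
    """Lowercase alphanumeric tokens (a stand-in for real word segmentation)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def title_keywords(title):
    """Title tokens minus stopwords (a stand-in for real keyword extraction)."""
    return {w for w in tokenize(title) if w not in STOPWORDS}


def split_sentences(body):
    """Naive sentence splitter on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", body) if s.strip()]


def correlation_score(title, body):
    """Assumed score in [0, 1]: keyword coverage times sentence spread."""
    keywords = title_keywords(title)
    sentences = split_sentences(body)
    if not keywords or not sentences:
        return 0.0
    sentence_words = [set(tokenize(s)) for s in sentences]
    body_words = set().union(*sentence_words)
    # Share of title keywords that occur anywhere in the body.
    coverage = len(keywords & body_words) / len(keywords)
    # Share of body sentences that mention at least one title keyword.
    spread = sum(1 for words in sentence_words if words & keywords) / len(sentences)
    return coverage * spread


def is_low_value(title, body, threshold=0.3):
    """Flag an article whose title keywords barely appear in its body."""
    return correlation_score(title, body) < threshold
```

Under these assumptions, an article whose body never mentions its title keywords scores 0 and is flagged, while an article whose keywords recur across most sentences scores well above the threshold. A real deployment would need to tune the threshold on labeled data.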
Keywords/Search Tags: news collection, news filtering, location extraction, news analysis