Research And Design Of Information Acquisition Technology Of Internet Public Opinion Analysis

Posted on: 2016-10-10 | Degree: Master | Type: Thesis
Country: China | Candidate: T H Gao | Full Text: PDF
GTID: 2298330467993196 | Subject: Computer technology
Abstract/Summary:
With the development and maturing of science and technology, the channels through which public opinion spreads have shifted from traditional media platforms to the Internet. The Internet has become part of everyday life, its users span all age groups, and the public relies on it more than ever before. Its wide use lets people learn of the latest events around the world and comment on them instantly. It is therefore necessary to monitor public opinion effectively and to delete malicious information and comments promptly, so as to prevent harmful effects on society.

The key technology for public opinion analysis is web crawling. This thesis studies how to crawl the information on web pages effectively, so as to provide a useful information source for public opinion analysis. Information crawling, the key module in public opinion analysis, is carried out by a web crawler. A traditional web crawler builds a URL queue by collecting the URLs found on web pages. It then visits the URLs one by one according to a suitable crawling strategy and extracts the information on each page, which is then passed on to public opinion analysis. In the Web 2.0 era, however, websites contain more and more dynamic pages, whose code execution can change a page's content and structure without changing its URL. The traditional web crawler therefore needs to be improved so that it can find the information on dynamic pages.

Based on the characteristics of dynamic pages and research on crawling strategies for them, the main work of this thesis is as follows:

1) Background knowledge related to information acquisition technology is studied. The web crawler is a key technology for information acquisition in a public opinion system, and Ajax is widely used in dynamic pages; in order to study information crawling for dynamic pages, this thesis therefore analyzes these two technologies in detail.
2) A requirements analysis of the functions of the information acquisition module is carried out, and the overall design of the module is completed. Based on the analysis of the function points, the overall workflow of the information acquisition module and the interfaces of its key modules are designed.
3) The units of the information acquisition module are designed in detail. The module is divided into four functional units: the page acquisition unit, the Ajax code detection unit, the Ajax code analysis unit, and the DOM filter unit. A detailed process design is given for each unit.
4) The information acquisition module is implemented and tested experimentally. The module's crawling function is tested through experiments that crawl information from dynamic pages, and the implemented functions of the module are demonstrated.

Through the above work, this thesis meets the functional requirements for collecting information from dynamic pages and improves the accuracy of dynamic-page information collection.
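The traditional URL-queue crawler described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the names `LinkExtractor`, `crawl`, `fetch`, and `SITE`, and the `example.test` URLs, are hypothetical and chosen only for illustration. The fetch function is supplied by the caller, so the sketch runs against an in-memory site without any network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_url, fetch, max_pages=100):
    """Breadth-first crawl driven by a URL queue.

    fetch is a caller-supplied (hypothetical) function that maps a URL to
    its HTML text, or returns None if the page cannot be retrieved.
    Returns a dict of URL -> HTML for every page visited.
    """
    queue = deque([seed_url])   # URLs waiting to be visited
    seen = {seed_url}           # URLs already queued, to avoid revisits
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages


# A tiny in-memory "website" standing in for real HTTP requests.
SITE = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b#top">B again</a>',
    "http://example.test/b": '<a href="/">home</a>',
}
pages = crawl("http://example.test/", SITE.get)
```

Because this sketch only follows links present in the fetched HTML, any content that a page loads through Ajax after script execution would never reach `pages`. That is the limitation of the traditional crawler on dynamic pages that the abstract points out.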
Keywords/Search Tags: online public opinion, information acquisition, dynamic page, document object model