| Geographic data have traditionally been captured through methods such as national general surveys and field surveys. However, as society develops and residential areas, roads, and other features keep changing, the drawbacks of these methods (high cost, heavy workload, low efficiency, and poor timeliness) have become increasingly prominent. As the Internet continues to grow, the volume of geographic data it carries keeps increasing, and these data contain rich hidden knowledge, making the Internet a new source of geographic information. Crawler technology alleviates, to some extent, the difficulty of capturing Web data, but a general-purpose crawler can rarely collect useful geographic data from the Internet. Building on a summary of general-purpose crawler technologies, the crawler technology for Internet geographic data studied here does not pursue broad coverage; instead, it targets network data related to geographic information, so as to achieve focused capture and to address the cost, workload, efficiency, and timeliness problems of traditional geographic information collection. The main research work of this paper is as follows. (1) Analysis and summary of the features of websites carrying Internet geographic information. Based on the working mechanism of a browser, the information exchange and display patterns of such websites are analyzed. From the perspective of crawler information collection, websites carrying surface-web geographic information are divided into three main types: M-Dom, M-Render, and M-Trigger. Websites carrying deep-web geographic information are analyzed through specific experiments, with a focus on the features of websites carrying deep-web POI geographic information. (2) Study of Internet geographic information acquisition technology. For surface-web geographic data collection, capture methods for single pages and list pages are the main focus. For deep-web geographic information collection, the capture difficulties and capture technologies are summarized, two sets of content search words are designed, and the corresponding capture strategies are studied. (3) Technical verification and prototype system development. Based on the methods, technologies, and strategies studied, a prototype system for Internet geographic data capture is designed, and its design is described in terms of system architecture, functions, modules, and core logic. The prototype system is implemented and then verified in application. |