| Geographic information plays an important role in many application fields, including people’s daily lives. In recent years, as the Internet has developed and matured, the total volume of geographic information data has grown exponentially. How to obtain the required geographic information through data mining has been studied by many scholars. Spatial data sources can be divided into two types: structured data sources and unstructured data sources. A number of researchers have retrieved data from structured sources, downloading geographic data files through links or interfaces. These well-structured sources provide static information. In contrast, the unstructured nature of webpages allows them to change frequently, and up-to-date information about places is often available on the Web. This paper studies geographic information acquisition and integration from both structured and unstructured data sources. For unstructured data on the Web, this paper proposes a web-based method that can mine place name and address data. For structured data, this paper addresses the problems that arise across different data sources by studying geographic information from multiple sources and eliminating semantic and other heterogeneity among them. It then combines multiple features into a similarity measure and uses a BP neural network, a machine learning method used in data mining, to link related data, which lays a good foundation for geographic information integration. Experiments demonstrate the feasibility of both methods, and the related technical indicators show that they achieve high accuracy.
Finally, we analyze the possibility of obtaining geographic information from the large amount of unstructured data on the Web, and of matching and integrating the geographic information extracted from structured data of different origins through data mining, which provides a direction for future study. |