
Study On The Ontology-based Extraction Of The Names Of Chinese Administrative Division

Posted on: 2012-03-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P Du
Full Text: PDF
GTID: 1115330335466451
Subject: Cartography and Geographic Information System
Abstract/Summary:
The number of web pages has grown rapidly with the development of the World Wide Web, and a huge quantity of geographic information is hidden in those billions of pages, waiting to be mined. Fully exploiting the geographic information on the web not only meets people's geographic query and retrieval needs but also contributes to Location-Based Services (LBS) and other emerging fields. Chinese place names are a major kind of geographic information resource on the web. In this study, the names of Chinese administrative divisions are extracted from web pages using a set of basic theories and methods: natural language processing, geo-ontology, elimination of geo/non-geo and geo/geo ambiguities, and geo-visualization.

At present, much research on the extraction of Chinese place names takes a purely natural language processing viewpoint and stops at preliminary recognition. Such work does not disambiguate ambiguous place names, so its extraction results cannot be used in geographic information services. Although some scholars have studied geographical spatio-temporal ontology or the recognition of Chinese place names, there has been no clear account or detailed theory of how to combine the two areas organically with a focus on place name disambiguation. This dissertation establishes an improved theoretical framework for Chinese place name recognition and extraction based on a place name spatio-temporal ontology. A prototype system is designed and implemented on the basis of that framework.

The main results of this study include:

① Following an introduction to and review of ontology, geo-ontology, spatial ontology, and related concepts, a model of place name spatio-temporal ontology consisting of BFO-SNAP and BFO-SPAN is designed on the basis of Basic Formal Ontology, using mereology, location theory, and topology. A Chinese administrative division spatio-temporal ontology is then constructed that formally expresses the changes and temporal characteristics of the evolution of place names.

② A prototype system for extracting the names of Chinese administrative divisions is designed and implemented using ontology-based information extraction in the GATE environment. The system converts the names of Chinese administrative divisions, which are indirect geospatial information, into precise geographic coordinates, removing to a certain extent the semantic barrier between unstructured spatial information in natural language and structured spatial information in GIS.

③ After an analysis of the characteristics and causes of the ambiguities in the names of Chinese administrative divisions, the ambiguities are divided into two types: geo/non-geo and geo/geo. The geo/geo ambiguity is further divided into two categories: places that have an administrative relationship and share the same special name, and places that have no administrative relationship but share the same name.

④ Two effective algorithms are designed to eliminate the widespread ambiguities in the names of Chinese administrative divisions in web texts. Names with geo/non-geo ambiguity are not extracted, while names with geo/geo ambiguity are extracted and assigned unique locations.

⑤ The extracted, unambiguous names of Chinese administrative divisions are given rich semantics and precise geographic coordinates according to the Chinese administrative division spatio-temporal ontology, and are then plotted on a map for visualization.
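The abstract does not describe the two disambiguation algorithms in detail, so the following is only a minimal Python sketch of one plausible approach to geo/geo ambiguity, not the dissertation's method. It assumes a tiny hypothetical gazetteer (`GAZETTEER`), a hypothetical `resolve` function, and rough illustrative coordinates. An ambiguous name is assigned a unique location only when exactly one candidate's parent division is also mentioned in the surrounding text.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Place:
    name: str
    level: str
    parent: Optional[str]  # name of the parent administrative division
    lat: float             # rough, illustrative coordinates only
    lon: float


# Hypothetical mini-gazetteer; a real system would query the ontology instead.
GAZETTEER = [
    Place("Chaoyang", "district", "Beijing", 39.92, 116.44),
    Place("Chaoyang", "prefecture-level city", "Liaoning", 41.57, 120.45),
    Place("Beijing", "municipality", None, 39.90, 116.40),
]


def resolve(name: str, context: Set[str]) -> Optional[Place]:
    """Return a unique Place for `name`, or None if it stays ambiguous.

    `context` holds the other place names found in the same text.
    """
    candidates = [p for p in GAZETTEER if p.name == name]
    if len(candidates) <= 1:
        # Unknown name (None) or unambiguous name (its single candidate).
        return candidates[0] if candidates else None
    # Keep only candidates whose parent division is mentioned nearby.
    supported = [p for p in candidates if p.parent in context]
    return supported[0] if len(supported) == 1 else None


if __name__ == "__main__":
    place = resolve("Chaoyang", {"Liaoning"})
    print(place.level if place else "ambiguous")
```

With "Liaoning" in the context, the sketch selects the prefecture-level city; with no supporting context it returns `None` rather than guessing, which mirrors the requirement that only uniquely located names are geo-coded.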
Keywords/Search Tags:place name spatio-temporal ontology, the names of Chinese administrative division, extraction, geo/non-geo ambiguity, geo/geo ambiguity, disambiguation, geo-parsing, geo-coding