
Spatial approaches to reducing error in geocoded data

Posted on: 2011-08-21
Degree: Ph.D.
Type: Dissertation
University: University of Southern California
Candidate: Goldberg, Daniel Wright
Full Text: PDF
GTID: 1440390002465419
Subject: Physical geography
Abstract/Summary:
The process of geocoding, converting geospatial textual information into one or more representative geographic locations or areas, is a fundamental geospatial operation essential to many diverse scientific fields such as environmental epidemiology, homeland security, sociology, political science, and transportation logistics. These geocoded data typically form the underlying data from which geographic mapping and visualization can occur, spatially-based research questions can be posed and investigated, and network routing and planning can be conducted. Although their applications vary widely, geocode users all require spatially accurate geocoded results as well as metrics capable of describing that accuracy.

The current state of the art in geocoding technology is often capable of producing industry-accepted, spatially accurate geocoded results under ideal conditions, when input data are of high quality and expensive reference data layers are available. For most consumers of geocoded data and geocoding tools, difficulties in producing high-quality data remain. Moreover, the existing methods used to describe the quality of these data are severely deficient for use in scientific studies.

In this dissertation, I describe a geocoding system that uses uncertainty-minimizing techniques to address many of the fundamental underlying problems that cause inaccurate spatial results. Specifically, we present a set of novel advances to geocoding algorithms which both increase the spatial accuracy of the output data and reduce the spatial uncertainty inherent in this information. To accomplish these tasks, we first outline a strategy for picking nearby candidate geocodes when a specific known reference feature is not available for a particular input address. This approach uses a spatially-varying block distance metric to define a local region of interest within which candidates are scored based on their spatial distance and attribute similarity. I next develop the concept of a spatial uncertainty-driven approach to candidate feature selection, integration, and interpolation that uses the characteristics of all available candidate features, their individual uncertainties, and their topological relationships to deduce the most likely candidate outcome. We finally turn our attention to the case of ambiguous results and develop both rule-based and spatial neighborhood-based approaches for choosing the appropriate candidate feature based on the relationships between ambiguous candidates and the characteristics of the local regions around them.

Together, these three branches of the research serve to increase match rates (the number of successfully geocoded results), reduce spatial error (the distance from the computed output location to the ground truth position), and reduce spatial uncertainty (the number/scale of equi-probable locations to which a geocode could belong) in geocoded information. These advances increase the quality of geocoded data used in scientific studies and will play a key role in developing the next generation of spatial analysis approaches that use spatial uncertainty to understand geospatial phenomena across scientific disciplines.
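As an illustration only, two of the quality measures named above, match rate and spatial error, could be computed roughly as in the following Python sketch. This is not the dissertation's implementation. The function names, the use of `None` to mark an unmatched record, the expression of match rate as a fraction of inputs, and the choice of great-circle (haversine) distance on a spherical Earth are all assumptions made for the example.

```python
import math

# Assumption: spherical Earth with mean radius in meters.
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def match_rate(results):
    """Fraction of input records that received a geocode (None means unmatched)."""
    if not results:
        return 0.0
    return sum(r is not None for r in results) / len(results)


def spatial_errors_m(results, truths):
    """Distance from each computed location to its ground truth, matched records only."""
    return [
        haversine_m(r[0], r[1], t[0], t[1])
        for r, t in zip(results, truths)
        if r is not None
    ]
```

Here `results` and `truths` are hypothetical parallel lists of `(lat, lon)` tuples. Spatial uncertainty, the third measure, depends on the set of equi-probable candidate locations and is not covered by this sketch.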
Keywords/Search Tags:Spatial, Data, Geocoded, Approaches, Geocoding, Scientific