
Bias-variance Error Decomposition for Data-driven Geospatial Modeling

Posted on: 2014-06-09
Degree: Ph.D
Type: Dissertation
University: The University of Wisconsin - Madison
Candidate: Gao, Jing
Full Text: PDF
GTID: 1450390005488237
Subject: Geography
Abstract/Summary:
Careful model evaluation is essential when using data-driven geospatial models. A useful evaluation should help the analyst (1) understand and (2) improve model performance. Commonly used error analysis methods provide limited help for accomplishing these goals. Hence, I propose to use the bias-variance (BV) error decomposition in geospatial modeling. This approach decomposes the expected model error into bias (systematic error), variance (model sensitivity to variations in training data), and noise (unavoidable error). Originating in statistics and machine learning, BV analysis has proven useful for achieving the aforementioned two goals of model evaluation, and it has been used to compare different error metrics. However, it has not been tested for geospatial models. This research investigates the BV decomposition of three error types relevant for geospatial modeling (squared error, absolute error, and categorical error), through both analytical inquiry and case studies. It is the first research to analytically derive the BV decomposition for absolute errors, the first to explore the usefulness of BV decomposition for geospatial models, and the first to investigate the implications of using different error definitions in geospatial model evaluation. My results showed that the benefits of BV analysis demonstrated in statistics and machine learning also apply to geospatial modeling. Additionally, the BV decomposition can reveal new insights about the modeled geospatial process; mapping bias can help identify and delineate model spatial non-stationarity; and mapping variance can help predict the effects of ensemble methods and guide training sample collection. All of these may assist the development of model improvement strategies. Further, squared, absolute, and categorical errors can potentially lead to different model evaluation conclusions for the same model. In practice, it can be beneficial to use both squared and zero-one errors for geospatial classification models, but for geospatial regression models a careful choice between squared and absolute errors is recommended. Interestingly, in my results, the widely accepted bias/variance tradeoff did not always emerge, but the cause of this phenomenon is not clear. Finally, the geospatial context also deepened the theoretical understanding of data-driven modeling in general, especially regarding the effects of ensemble methods and effective model training.

Key words: bias-variance error decomposition, modeling errors, model evaluation, uncertainties, geospatial modeling, classification models, land cover / land use change, regression models, environmental remote sensing.
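
The decomposition of expected error into bias, variance, and noise described in the abstract can be illustrated, for the squared-error case only, with a small simulation. The sketch below is not the dissertation's method or data: the sine-shaped `true_signal`, the polynomial model, the function name `bv_decompose_squared`, and all parameter values are assumptions chosen for illustration. It relies on knowing the noise variance, which is possible only in simulation. The absolute-error and categorical-error decompositions derived in the dissertation are not reproduced here.

```python
import numpy as np


def true_signal(x):
    # Hypothetical noise-free process, assumed only for this illustration.
    return np.sin(2 * np.pi * x)


def bv_decompose_squared(degree, n_train=30, n_sets=200, noise_sd=0.3, seed=0):
    """Estimate squared-error bias^2, variance, and noise at each test location.

    Many training sets are drawn from the same simulated process, a polynomial
    of the given degree is fitted to each, and the spread of the resulting
    predictions is summarized per test location.
    """
    rng = np.random.default_rng(seed)
    x_test = np.linspace(0.05, 0.95, 50)   # stand-in for prediction locations
    f_test = true_signal(x_test)

    preds = np.empty((n_sets, x_test.size))
    for i in range(n_sets):
        x_tr = rng.uniform(0.0, 1.0, n_train)
        y_tr = true_signal(x_tr) + rng.normal(0.0, noise_sd, n_train)
        coef = np.polyfit(x_tr, y_tr, degree)
        preds[i] = np.polyval(coef, x_test)

    mean_pred = preds.mean(axis=0)
    bias_sq = (mean_pred - f_test) ** 2              # systematic error
    variance = preds.var(axis=0)                     # sensitivity to training data
    noise = np.full_like(f_test, noise_sd ** 2)      # unavoidable error (known here)

    # Expected squared error against a noisy observation: the mean squared
    # deviation from the noise-free signal plus the noise variance.
    expected_error = ((preds - f_test) ** 2).mean(axis=0) + noise
    return bias_sq, variance, noise, expected_error
```

Because the function returns one value per test location, the bias and variance arrays could in principle be plotted along the location axis, which is a one-dimensional analogue of the bias and variance maps mentioned in the abstract. Calling it with a low and a higher polynomial degree (for example 1 and 5) gives a rough sense of how the components shift with model flexibility in this toy setting.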
Keywords/Search Tags:Geospatial, Model, Error, Decomposition, Data-driven, Bias-variance