
Efficient Computational Methods for Large Spatial Data

Posted on: 2016-01-07
Degree: Ph.D.
Type: Dissertation
University: North Carolina State University
Candidate: Parker, Ryan Jeremy
Full Text: PDF
GTID: 1470390017977428
Subject: Statistics
Abstract/Summary:
Due to continued advances in technology, the ability to collect and store large data sets is commonplace. In this dissertation we focus on spatial data collected through technologies such as remote sensing, satellites, and computer model output. These data arise in a variety of areas, such as research on the climate or environment, or in settings where complex computer codes are constructed to simulate a scientific process. Traditional methods for analyzing these data can be very time consuming to run on current computing platforms, or, in some cases, the data can be too massive for traditional approaches to be considered at all. We propose methods that allow for efficient computing in three different areas of spatial statistics: massive computer model output used in climate research that is observed on a rectangular grid; estimation of a nonstationary spatial covariance observed on an irregular grid; and emulation of complex computer codes having a high-dimensional input space.

The first problem we consider is that of assessing value added within climate model research. Climate models have emerged as an essential tool for studying the earth's climate, and it is common to run computationally expensive global models at a coarse spatial resolution. Regional models are then run with boundary conditions taken from the global model to achieve a finer spatial scale for local analysis. These regional models add to the computing expense, and it is of interest to know how much value the finer scale adds over the coarser scale. We propose a new method for assessing the value added by these higher resolution models, and we demonstrate it within the context of Regional Climate Models (RCMs) from the North American Regional Climate Change Assessment Program (NARCCAP).
Our spectral approach, based on the discrete cosine transformation (DCT), characterizes the joint relationship between observations, coarser scale models, and higher resolution models to identify how the finer scales add value over the coarser output. The joint relationship is computed without cumbersome matrix operations by instead estimating the smaller covariance of the data sources at different spatial scales with a Bayesian hierarchical model. Using this model we can efficiently estimate the value added by each data source over the others.

The second method we propose allows us to efficiently estimate nonstationary spatial covariance parameters. Nonstationary covariance models require a large number of parameters, and we develop a fused lasso approach to this problem. We partition the domain into a fine grid of subregions, with each subregion having its own covariance parameter, and then penalize the differences between the parameter values of neighboring subregions. When the L1 norm is used to penalize these absolute differences, we achieve fusion that allows us to estimate stationary subdomains. We evaluate this technique with a simulation study and demonstrate it on a data set for tropospheric ozone in the US.

Finally, we develop a computationally efficient method for emulating the output of deterministic computer codes within a designed experiment. By assigning observations to blocks, we develop estimating equations that allow us to estimate Gaussian process model covariance parameters quickly, reducing large matrix operations and exploiting parallelization. We evaluate block assignment strategies with simulation studies for estimation and prediction, finding that assuming independence across larger blocks works well for estimation, and that a subset approach should be preferred for prediction in many cases.
This approach is demonstrated on an inverse dynamics model for a SARCOS robot arm that has a 21-dimensional input with over 40,000 observations.
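The fused lasso penalty described above can be illustrated numerically: each subregion of a partitioned domain carries its own covariance parameter, and an L1 penalty on absolute differences between neighboring subregions drives many of those differences to exactly zero, fusing subregions into stationary blocks. The sketch below is illustrative only and is not the dissertation's implementation; the grid size, penalty weight, squared-error data term, and the `fused_lasso_objective` helper are all assumptions made for the example.

```python
import numpy as np

def fused_lasso_objective(theta, theta_local, lam, grid_shape):
    """Data-fit term plus L1 fusion penalty on neighboring subregions.

    theta       : candidate parameter value per subregion (flattened grid)
    theta_local : rough per-subregion estimates (stand-in for a local fit)
    lam         : penalty weight on neighbor differences
    grid_shape  : (rows, cols) of the subregion grid
    """
    t = theta.reshape(grid_shape)
    fit = np.sum((theta - theta_local) ** 2)
    # L1 penalty on absolute differences of vertically and horizontally
    # adjacent subregions; exact zeros in these differences fuse regions.
    penalty = np.abs(np.diff(t, axis=0)).sum() + np.abs(np.diff(t, axis=1)).sum()
    return fit + lam * penalty

# Toy example: a 4x4 grid whose true parameter is 1.0 on the left half
# and 2.0 on the right half, observed with noise.
rng = np.random.default_rng(0)
truth = np.kron([[1.0, 2.0]], np.ones((4, 2))).ravel()
noisy = truth + 0.1 * rng.standard_normal(truth.size)

# The fused (piecewise-constant) candidate pays a far smaller neighbor
# penalty than the noisy per-subregion estimates, at little cost in fit,
# so the objective prefers it -- the mechanism behind the fusion.
j_fused = fused_lasso_objective(truth, noisy, lam=1.0, grid_shape=(4, 4))
j_noisy = fused_lasso_objective(noisy, noisy, lam=1.0, grid_shape=(4, 4))
print(j_fused < j_noisy)  # the fused field attains the lower objective
```

In practice the objective would be minimized over all candidate fields rather than compared at two points, but the comparison shows why the L1 neighbor penalty favors piecewise-stationary solutions.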
Keywords/Search Tags: Data, Spatial, Large, Model, Method, Efficient, Over