
Application Of Robust Multiple Linear Regression In Geographic Data Processing

Posted on: 2013-01-06    Degree: Master    Type: Thesis
Country: China    Candidate: X H Han    Full Text: PDF
GTID: 2230330371990732    Subject: Cartography and Geographic Information Engineering
Abstract/Summary:
Multiple linear regression is a commonly used mathematical method for establishing geographic analysis models. Statisticians have noted that the probability of gross errors appearing in production practice and in data collected from scientific experiments is about 1%–10%. To eliminate or weaken the effects of gross errors on parameter estimation, G. E. P. Box proposed the concept of robust estimation in 1953. Robust estimation theory is based on the actual distribution of the data, rather than an ideal distribution. Appropriate methods are adopted to ensure that the estimated parameter values are unaffected by unavoidable gross errors while remaining close to optimal under the normal model. Robust multiple linear regression can efficiently eliminate or weaken the influence of gross errors on parameter estimation when gross errors inevitably exist in the observations. The extent of gross errors eliminated varies with the robust estimation method itself and with the observations of the specific problem. This paper compares the capability of commonly used robust estimation methods to eliminate or weaken gross errors through simulation experiments. It determines the extent of gross errors eliminated (EGEE) by robust estimation methods applied to multiple linear regression, as well as the minimum number of observations needed to eliminate gross errors within certain ranges completely.

This paper presents a new approach to determining the EGEE of a robust estimation method, together with a specific calculation procedure. Taking multiple linear regressions with 2–5 independent variables as examples, simulation experiments (1000 runs) are used to compare 13 frequently used robust estimation methods. Several additional efficient robust estimation methods are identified for multiple linear regression. Finally, the minimum number of observations needed to eliminate gross errors (3.0–8.0σ0) completely is also determined.
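Robust estimation schemes of the kind compared here are typically solved by iteratively reweighted least squares (IRLS). As a minimal sketch, not the thesis's exact procedure, the following Python code implements IRLS with the Danish weight function; the exponential down-weighting form, the constant c = 2.0, and the MAD-based scale estimate are common but assumed choices:

```python
import numpy as np

def danish_weights(res, sigma, c=2.0):
    """Danish weight function (assumed form): full weight for small
    residuals, exponential down-weighting beyond c * sigma."""
    t = np.abs(res) / (c * sigma)
    w = np.ones_like(res)
    mask = t > 1.0
    w[mask] = np.exp(-(t[mask] ** 2))
    return w

def robust_ols_irls(X, y, c=2.0, n_iter=20, tol=1e-8):
    """Robust multiple linear regression by iteratively reweighted
    least squares. X must include a column of ones for the intercept."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary LS start
    for _ in range(n_iter):
        res = y - X @ beta
        # robust scale estimate via the median absolute deviation (MAD)
        sigma = 1.4826 * np.median(np.abs(res - np.median(res)))
        sigma = max(sigma, 1e-12)
        w = danish_weights(res, sigma, c)
        # weighted normal equations: (X' W X) beta = X' W y
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta

if __name__ == "__main__":
    # simulated observations with one injected gross error
    rng = np.random.default_rng(0)
    n = 50
    x1, x2 = rng.uniform(0, 10, n), rng.uniform(0, 10, n)
    X = np.column_stack([np.ones(n), x1, x2])
    y = 1 + 2 * x1 + 3 * x2 + rng.normal(0, 0.1, n)
    y[5] += 10.0  # gross error, far outside the 3-8 sigma range
    print(robust_ols_irls(X, y))  # approximately [1, 2, 3]
```

Feeding in observations that contain a simulated gross error, as above, shows the scheme down-weighting the outlier so that the coefficient estimates stay near the error-free values; this is the behaviour the simulation experiments quantify.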
In summary, the L1 method, German–McClure method, IGGIII scheme, and Danish method are comparatively more effective among the 14 robust estimation methods. When the observations contain one gross error, the minimum numbers of observations for binary, ternary, quaternary, and five-variable linear regressions that fully eliminate the influence of gross errors (3.0–8.0σ0) are 7, 8, 10, and 11, respectively. When the observations contain two gross errors simultaneously, the corresponding minimum numbers of observations are 10, 12, 15, and 17, respectively.

Simple linear regression is one of the most widely used methods of parameter estimation. This paper proposes a bidirectional golden-section arrangement of the independent variables, which are otherwise taken in arithmetic progression; it increases the redundant observation numbers at both endpoints and narrows the differences in redundant observation numbers among the observations. Without increasing the number of observations or changing the observation accuracy, this method improves the capability of robust estimation methods to eliminate or weaken gross errors.

The coefficients of a multiple linear regression are usually solved by the least-squares method, but in practical applications the phenomenon of multicollinearity among variables often seriously affects parameter estimation. In this respect, for a variety of practical problems, Hoerl, Massy, Webster, and Stein introduced ridge regression, principal component regression, the shrinkage estimator, and the robust latent root estimator of regression coefficients, respectively, to weaken the effects of multicollinearity. Accordingly, this paper summarizes the commonly used diagnostics for multicollinearity and the methods for eliminating its influence.
Common diagnostics for multicollinearity mainly include the latent root, the variance inflation factor, and the tolerance value; elimination methods mainly include ridge regression, principal component regression, partial least-squares estimation, and so on.
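Two of the tools listed above can be sketched briefly. The following Python code computes variance inflation factors (VIF_j = 1 / (1 − R_j²), from regressing predictor j on the others) and a ridge estimate; standardizing the predictors and the choice of ridge parameter are illustrative assumptions, not the thesis's prescriptions:

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of the predictor
    matrix X (no intercept column): VIF_j = 1 / (1 - R_j^2)."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        xj = X[:, j]
        # regress column j on the remaining columns (with intercept)
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, xj, rcond=None)
        resid = xj - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((xj - xj.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

def ridge(X, y, lam):
    """Ridge regression on standardized predictors:
    beta = (X'X + lam * I)^{-1} X'y, which shrinks coefficients and
    stabilizes the estimate when X'X is near-singular."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = y - y.mean()
    p = Xs.shape[1]
    return np.linalg.solve(Xs.T @ Xs + lam * np.eye(p), Xs.T @ ys)
```

A VIF well above 10 is a common rule-of-thumb signal of harmful collinearity; in that case the ridge estimate, unlike plain least squares, remains numerically stable because the penalty term lam * I keeps the normal-equation matrix well conditioned.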
Keywords/Search Tags: multiple linear regression, robust estimation, geographic data, comparison of methods, extents of gross errors eliminated, minimum observed numbers