Comparison and evaluation of the effect of outliers on ordinary least squares and Theil nonparametric regression with the evaluation of standard error estimates for the Theil nonparametric regression method

Posted on: 1999-08-26
Degree: Ph.D.
Type: Dissertation
University: Lehigh University
Candidate: Wasser, Thomas Emerson
Full Text: PDF
GTID: 1460390014973866
Subject: Mathematics
Abstract/Summary:
Introduction. Detection of outliers in Ordinary Least Squares (OLS) regression is important for researchers who want to prevent spurious values from affecting slope and intercept estimates. Visual inspection and removal of values that 'look' like outliers may introduce selection bias. Through a simulation study, this dissertation evaluates the accuracy and efficiency of OLS versus the Theil nonparametric regression method in the presence of outliers, across small sample sizes and different correlation levels. In addition, the study tests the Tukey standard error of the median, Kendall's tau, and the Bootstrap for use as a standard error for the Theil procedure.

Methods. Simulated data sets were generated at three correlation levels (rho = 0.50, rho = 0.75, and rho = 0.90) crossed with three sample sizes (n = 5, n = 15, and n = 25). Outliers were added at various positions in the data sets, and OLS and Theil regressions were calculated on all data sets. The slope and intercept estimates were compared with the simulation specifications to determine accuracy. In addition, the three standard error methods were tested against the simulation estimates of error for the Theil procedure to determine whether they were accurate enough to be useful. Finally, the simulation standard error estimates for the Theil and OLS slopes and intercepts were compared to determine which procedure was relatively more efficient.

Results. Both OLS and Theil regression estimates were accurate when no outliers were present, regardless of correlation level and sample size. When outliers were present in the data, the Theil procedure always provided more accurate estimates than OLS; however, when outliers were in the tails of the distribution and the samples were small, the Theil slope and intercept estimates were not useful. Differences between the simulation values and the OLS and Theil estimates became smaller as correlation and sample size increased. In general, OLS estimates were more efficient when no outliers were present, while the Theil estimates were more efficient when outliers were present. Standard error estimates for the Theil procedure show that the Bootstrap and Tukey's method provide similar results; however, these are often not useful because of the large difference between the standard error estimates and the simulation values. Kendall's tau was not found to be useful.

Conclusions. When no outliers are present, both OLS and the Theil procedure provide useful estimates of both slope and intercept. When outliers are present, the Theil procedure should be used, but caution is warranted when outliers are in the tails of the 'y' variables. Bootstrap standard errors are generally more accurate for larger sample sizes, but are not accurate when samples are small. In small 'n' situations the Tukey method is more accurate for both slope and intercept. In general, no universal recommendation for a standard error suitable for the Theil procedure can be made.
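The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's code or data. It assumes the usual definition of the Theil slope as the median of all pairwise slopes, and it takes the intercept as the median of y - slope * x, which is one common convention that the abstract does not confirm. The function names (`ols_fit`, `theil_fit`, `bootstrap_se`), the toy line y = 2 + 0.5x, the outlier size, and the choice to resample (x, y) pairs in the bootstrap are all illustrative assumptions.

```python
import random
import statistics
from itertools import combinations


def ols_fit(x, y):
    """Ordinary least squares slope and intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def theil_fit(x, y):
    """Theil estimator: slope is the median of all pairwise slopes.

    Assumption: the intercept is the median of y_i - slope * x_i.
    """
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]  # skip pairs with an undefined slope
    ]
    slope = statistics.median(slopes)
    intercept = statistics.median(yi - slope * xi for xi, yi in zip(x, y))
    return slope, intercept


def bootstrap_se(x, y, fit, n_boot=500, seed=0):
    """Bootstrap standard error of the slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # degenerate resample: slope is undefined
        ys = [y[i] for i in idx]
        slopes.append(fit(xs, ys)[0])
    return statistics.stdev(slopes)


# Toy data (hypothetical): an exact line with n = 15 and one outlier
# placed in the upper tail.
x = [float(i) for i in range(1, 16)]
y = [2.0 + 0.5 * xi for xi in x]
y[-1] += 20.0

ols_slope, ols_intercept = ols_fit(x, y)
theil_slope, theil_intercept = theil_fit(x, y)
theil_se = bootstrap_se(x, y, theil_fit)

print("OLS:  ", ols_slope, ols_intercept)
print("Theil:", theil_slope, theil_intercept)
print("Bootstrap SE of Theil slope:", theil_se)
```

On this toy data the single tail outlier pulls the OLS slope well away from 0.5, while the Theil slope and intercept stay at the generating values because most pairwise slopes do not involve the outlying point. With noisy data and smaller n, as the abstract reports, the Theil estimates can degrade as well.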
Keywords/Search Tags:Standard error, Theil, Outliers, Regression, OLS, Slope and intercept, Values