
Reducing Bias and Increasing Precision in Nonexperimental Studies

Posted on: 2015-09-16  Degree: Ph.D.  Type: Dissertation
University: Northwestern University  Candidate: Tang, Yang  Full Text: PDF
GTID: 1475390017999528  Subject: Statistics
Abstract/Summary:
This dissertation is a collection of three papers that develop and validate statistical methods to reduce bias and increase precision in nonexperimental studies. The first two chapters study the extent to which precision can be increased and bias reduced by adding either pretest measures of the study outcome or a nonequivalent comparison group to the basic regression discontinuity (RD) design. The third chapter proposes a contingency theory of improving the accuracy of raw American Community Survey (ACS) Census tract estimates by adding past data from the same tract and contemporaneous spatial data from adjacent tracts.

Chapter 1 derives the power gain from adding either pretest measures of the study outcome or a nonequivalent comparison group to the basic RD design. It examines the statistical power of two kinds of comparative regression discontinuity (CRD) designs relative to both the basic RD and the randomized experiment (RCT). One CRD type uses pretest values (CRD-Pre) as the untreated comparison function, while the other uses a nonequivalent comparison group (CRD-CG). We show that: (1) each type of CRD can attain statistical power considerably greater than a basic RD; (2) for the same sample size, with a very strong pretest-posttest correlation, as is found in many applications, CRD-Pre can attain power very close to the experiment's; (3) holding the sample size of the RD and RCT fixed, adding comparison cases to the CRD-CG can achieve almost the same statistical power as the RCT. Many studies with prospective and administrative data permit computing untreated comparison functions, and this chapter adds another argument to the case for CRD as the design of choice in many settings where the basic RD design is now used.

Chapter 2 assesses the performance of CRD-Pre and CRD-CG by conducting six within-study comparisons, based on three outcomes and two assignment variables, using data from the Head Start Impact Study.
We conclude that (I) both RD and CRD designs produce unbiased estimates at the cutoff relative to RCT benchmarks, but CRD designs have greater power at the cutoff. When certain conditions are met, CRD designs produce unbiased estimates above the cutoff and are at least as powerful as the RCT above the cutoff. In contrast, RD cannot generalize the treatment effect away from the cutoff, so CRD designs are strongly recommended as replacements for RD whenever possible. (II) Both CRD-Pre and CRD-CG are unbiased. However, the power advantage of each design depends on its parameter values: the pretest-posttest correlation in CRD-Pre and the proportion of comparison cases in CRD-CG. Researchers may construct either CRD-Pre or CRD-CG depending on which data are accessible. (III) Although CRD designs can generalize away from the cutoff, they still estimate local effects. Thus, to estimate the average treatment effect, researchers should always use an RCT. However, when the RCT sample size is too small to produce reliable estimates, or when the treatment effect is heterogeneous across populations, we suggest that researchers consider constructing CRD designs and estimating local effects.

Chapter 3 proposes a contingency theory to improve the accuracy of raw American Community Survey (ACS) Census tract estimates by adding past data from the same tract and contemporaneous spatial data from adjacent tracts. We use past data from a given tract and current data from immediately adjacent tracts to create a spatial model with a time covariate (SMTC). We then test how raw ACS estimates are improved by adding just past data, just current spatial data, and then both through the SMTC. Because the amount of improvement depends on the accuracy of the raw ACS estimates, we need to learn which kinds of variables the ACS measures less well. We therefore randomly select 34 variables common to the ACS and the Census 2000 long form and calculate the correlations between the two, with the Census 2000 values as the benchmark.
The less well measured ACS tract variables are those involving low frequencies or those with a base of persons rather than households. Lastly, we suggest a contingency theory of improving small-area estimates based on five elements: the past data, the current spatial data, the accuracy of the raw ACS estimates, and the roles of low frequency and of person-based rather than household-based rates in determining when ACS estimates are so inaccurate as to need model-based improvement. (Abstract shortened by UMI.)
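
The benchmark comparison described for Chapter 3 (correlating tract-level estimates with benchmark values, variable by variable) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the dissertation's data or procedure: the tracts are synthetic, the variable names `common_characteristic` and `rare_characteristic` are hypothetical, the benchmark is treated as error-free, and the sample size of 200 per tract is arbitrary. Under those assumptions it shows one plausible reason a low-frequency rate can correlate less well with its benchmark: sampling noise is large relative to the variation between tracts.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_tract_rates(mean_rate, sample_size, n_tracts, rng):
    """Synthetic tracts (hypothetical): a benchmark rate per tract and a
    noisy estimate of that rate from a simple random sample of the tract."""
    benchmark, estimate = [], []
    for _ in range(n_tracts):
        # Tract-level "true" rate, clipped so it stays a valid proportion.
        p = rng.gauss(mean_rate, 0.3 * mean_rate)
        p = min(max(p, 0.001), 0.999)
        # Sample-based estimate: share of sampled units with the characteristic.
        hits = sum(rng.random() < p for _ in range(sample_size))
        benchmark.append(p)
        estimate.append(hits / sample_size)
    return benchmark, estimate


rng = random.Random(0)  # fixed seed so the sketch is reproducible
results = {}
for name, mean_rate in [("common_characteristic", 0.40),
                        ("rare_characteristic", 0.02)]:
    bench, est = simulate_tract_rates(mean_rate, sample_size=200,
                                      n_tracts=500, rng=rng)
    results[name] = pearson(bench, est)

# List variables from least to most accurately measured.
for name, r in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name}: r = {r:.3f}")
```

In this synthetic setup the rare characteristic ends up with the lower correlation, which mirrors the direction of the finding reported above but says nothing about its size in the actual ACS data.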
Keywords/Search Tags: ACS, CRD designs, Bias, Precision, Basic RD, Data, RCT, Tract