Unification of Randomized Response Designs and Certain Aspects of Post-Randomization for Statistical Disclosure Control

Posted on: 2012-05-21
Degree: Ph.D
Type: Dissertation
University: The George Washington University
Candidate: Adeshiyan, Samson A
Full Text: PDF
GTID: 1469390011962671
Subject: Statistics
Abstract/Summary:
This dissertation deals with two closely related topics - randomized response (RR) surveys and post-randomization - that are concerned with survey respondents' privacy protection and confidentiality. First, we present a common framework for discussing various RR surveys of dichotomous populations with polychotomous responses. The unified approach addresses both respondents' privacy and statistical efficiency and is helpful for fair comparison of various procedures. We describe a general technique for constructing unbiased estimators of the proportion (pi) of the population that belongs to a sensitive or stigmatized group based on arbitrary RR procedures, from unbiased estimators based on an open or direct survey with the same sampling design. The technique works well for any sampling design p(s) and also for variance estimation. We develop an approach for comparing RR procedures, taking both respondents' protection and statistical efficiency into account. For any given RR design with three or more response categories, we can find RR procedures with a binary response variable which provide the same respondents' protection and at least as much statistical information. This result suggests that RR surveys of dichotomous populations should use only binary response variables.

In many situations there may be more than two natural population categories, so we also investigate RR surveys for polychotomous populations, with k categories of which at least one is sensitive or stigmatized. We extend the theory and framework for RR surveys of dichotomous populations to RR surveys of polychotomous populations, including estimation in finite population settings. We also discuss comparison of polychotomous RR designs where only one category is sensitive.

The second topic is post-randomization (PRAM), which is a statistical disclosure control technique for categorical variables. The PRAM stochastically transforms each record in a microdata set using pre-selected probabilities. We demonstrate that any PRAM procedure can be regarded as a PRAMing of the cross-classification of all the variables in the data set. We discuss some connections to RR surveys and note that the estimators developed for RR surveys are applicable for estimation from PRAMed data. We focus on a special case of PRAM, known as invariant PRAM and introduce the notion of a strongly invariant PRAM. The invariant PRAM is attractive in that in the strong situation, the PRAMed data can be analyzed without adjustment for post-randomization. We review methods for constructing invariant PRAM matrices, clarify certain misconceptions about invariant PRAM, and discuss estimation from an invariantly PRAMed microdata set. Finally, we examine the effectiveness of PRAM for limiting statistical disclosure.
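
As an illustration of the PRAM ideas summarized above, the following is a minimal Python sketch and not the dissertation's own method. It assumes a single categorical variable coded 0..k-1 and a row-stochastic transition matrix P, where P[i, j] is the probability that category i is released as category j. The function names (apply_pram, estimate_original_counts, invariant_pram_matrix) and the example matrix are hypothetical. The invariant matrix is built with a two-stage construction commonly described in the PRAM literature, which may differ from the constructions reviewed in the dissertation.

```python
import numpy as np


def apply_pram(values, P, rng):
    """Replace each category code i by a random draw from row i of P."""
    values = np.asarray(values)
    k = P.shape[0]
    out = np.empty_like(values)
    for i in range(k):
        idx = np.flatnonzero(values == i)
        if idx.size:  # skip categories that do not occur
            out[idx] = rng.choice(k, size=idx.size, p=P[i])
    return out


def estimate_original_counts(pram_counts, P):
    """Moment estimator of the original counts.

    Since E[T*] = P^T T, an unbiased estimate is the solution of
    P^T T_hat = T*. This requires P to be nonsingular.
    """
    return np.linalg.solve(P.T, np.asarray(pram_counts, dtype=float))


def invariant_pram_matrix(P, counts):
    """Two-stage construction of a matrix R with counts @ R == counts.

    Q[l, k] = t_k P[k, l] / sum_j t_j P[j, l] reverses P with respect to
    the observed counts t, and R = P Q is then invariant for t.
    Assumes every column of counts[:, None] * P has a positive sum.
    """
    counts = np.asarray(counts, dtype=float)
    joint = counts[:, None] * P        # joint[k, l] = t_k * P[k, l]
    Q = (joint / joint.sum(axis=0)).T  # normalize each column, then transpose
    return P @ Q


# Hypothetical example values, chosen only for illustration.
P = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.20, 0.80]])
counts = np.array([50.0, 30.0, 20.0])

R = invariant_pram_matrix(P, counts)
rng = np.random.default_rng(0)
original = np.repeat(np.arange(3), counts.astype(int))
released = apply_pram(original, R, rng)
released_counts = np.bincount(released, minlength=3)
```

Under an invariant matrix such as R, the expected released counts equal the original counts, so the released frequency table can be read without the inversion step in estimate_original_counts. Under a general P, that inversion step is the analogue of the RR-style estimators mentioned in the abstract.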
Keywords/Search Tags:Statistical disclosure, PRAM, RR surveys, Response, Post-randomization, RR procedures, Estimation