Biased sampling models with unknown selection function

Posted on: 2002-03-06
Degree: Ph.D
Type: Thesis
University: Rutgers The State University of New Jersey - New Brunswick
Candidate: Yin, Hong
GTID: 2460390014450099
Subject: Statistics

Abstract/Summary:
This thesis concerns statistical inference in biased sampling models. In real-life applications of statistics, practical constraints often mean that the population of interest is observed only through indirect, incomplete, or otherwise biased data. Biased sampling models describe the relationship between the data and the target population in such situations through a selection function, which can be viewed as the conditional probability of directly observing an individual item from the target population, given the value of the item. The focus of the thesis is regression analysis in biased sampling models where the selection function is bounded and increasing but otherwise completely unknown. Our investigation provides methodologies for statistical inference about the linear dependency of response variables on covariates and for the prediction of future responses at given values of the covariates.

The problems under consideration have wide applications in biology, geology, epidemiology, astronomy, survey sampling, econometrics and more. For example, low-magnitude earthquakes are underrepresented in historical seismic records: earthquakes of sufficiently large magnitude are certain to be observed, but lower-magnitude ones may not be.

Generalized linear models are studied in which the distribution of the response variable, conditional on the covariates, belongs to an exponential family and the selection function is bounded, increasing, but otherwise completely unknown. Estimation of the regression parameter and the selection function is investigated through the maximum likelihood method and the EM algorithm. The maximum likelihood estimator for the regression parameter is proved to be asymptotically normal and efficient. The nonparametric maximum likelihood estimator for the selection function is shown to converge at a cube-root rate. It is also found that, in the absence of covariates, the maximum likelihood estimator for the population mean converges only at a logarithmic rate.
A simulation study shows that our method is feasible and the asymptotic theory performs reasonably well.
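
As an illustration of the sampling mechanism described in the abstract (not the thesis's own simulation study), the following minimal sketch draws a target population of "magnitudes" and keeps each one with a probability given by a bounded, increasing selection function. The Gaussian population, the exponential-then-capped form of the selection function, and all parameter values are assumptions chosen only for illustration; in the thesis the selection function is unknown and must be estimated.

```python
import math
import random


def selection_prob(y, threshold=5.0, scale=1.0):
    """Hypothetical selection function (an assumption, not from the thesis).

    It is bounded above by 1 and nondecreasing in y: items at or above
    `threshold` are always observed, smaller items less and less often.
    """
    return min(1.0, math.exp((y - threshold) / scale))


def simulate_biased_sample(n_population, seed=0):
    """Draw a target population and the biased subsample that is observed."""
    rng = random.Random(seed)
    # Assumed target population: Gaussian "magnitudes" with mean 4, sd 1.
    population = [rng.gauss(4.0, 1.0) for _ in range(n_population)]
    # Each item is observed independently with probability selection_prob(y).
    observed = [y for y in population if rng.random() < selection_prob(y)]
    return population, observed


population, observed = simulate_biased_sample(20000)
pop_mean = sum(population) / len(population)
obs_mean = sum(observed) / len(observed)

print(f"items in target population: {len(population)}")
print(f"items actually observed:    {len(observed)}")
print(f"target population mean:     {pop_mean:.3f}")
print(f"naive mean of observed data: {obs_mean:.3f}")
```

Because small values are observed less often, the naive mean of the observed data overstates the population mean. This is the kind of bias that the estimation methods in the thesis are designed to correct without knowing the selection function in advance.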
Keywords/Search Tags:Biased sampling models, Selection function, Maximum likelihood estimator, Unknown, Population