Some Problems Of Estimating A Distribution Function With Missing Data

Posted on: 2008-01-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Z Lu
GTID: 1100360215484167
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Missing data are ubiquitous: respondents in a household survey may refuse to report private information; in an industrial process, some results are missing because of mechanical breakdowns; in economic or commercial activities, some data are missing, intentionally or not; patients do not show up on schedule; and in clinical experiments the recorded data are incomplete owing to subjective or objective factors.

Let Y denote the random variable associated with the data. The aim of statistical analysis with missing data is to infer characteristics of Y such as its mean, quantiles, and distribution function. Since the distribution function is the most comprehensive characterization of Y, this dissertation is devoted to one problem: how to estimate a distribution function with missing data. A solution to this problem is of theoretical significance and has wide application.

The problem has been explored in the literature. If the missing data mechanism is known up to some unknown parameters and the distribution function is assumed to follow a parametric model, reasonably mature solutions exist; see Little & Rubin [14] and the works cited there. If the missing data mechanism is known, Huh Lawless [11] introduced nonparametric maximum likelihood estimation to estimate the underlying unspecified distribution function. If the missing data mechanism is known up to an unknown scalar parameter and the distribution function is unspecified, Leigh [13] obtained a so-called semiparametric maximum likelihood estimator.

However, many aspects of the problem remain unsolved. In particular, there is a gap when Y is a multidimensional random vector with unspecified distribution and the missing data mechanism cannot be determined by a scalar parameter. Because of the inherent complexity of missing data mechanisms, it is hard to find a uniform way to solve the problem. This dissertation studies the following unsolved aspects.

In Chapter two, we introduce a procedure to estimate the underlying distribution function of a two-dimensional random vector (X, Y), where the missing data mechanism is known up to a k-dimensional unknown parameter θ. Estimators of the parameter and of the underlying distribution are constructed, and their asymptotic properties are studied.

In Chapter three, we assume that it is known for certain whether or not the missing data belong to some prescribed intervals, and that the missing data mechanism is known up to a k-dimensional unknown parameter θ. We derive the estimators and study their asymptotic properties.

In Chapter four, we obtain a consistent estimator of the underlying distribution of a discrete random variable Y, using information carried by another random variable X associated with Y.

In Chapter five, we introduce a method to estimate the underlying distribution function F(x, y) when the missing data mechanism is MAR (missing at random) but unspecified.

Chapter six presents a two-wave sampling method to estimate the distribution function of a discrete random variable Y, where we assume that the two missing data mechanisms share the same functional form and may differ only in an unknown scalar parameter.
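
To make the basic setting concrete, the following is a minimal sketch of how a distribution function can be estimated when the missing data mechanism is fully known. It uses a standard inverse-probability-weighted empirical distribution function, not the estimators developed in the dissertation or in the cited works. The logistic response probability `response_prob`, the simulated model Y = X + noise, and all function names are assumptions made purely for illustration.

```python
import math
import random


def response_prob(x):
    """Hypothetical known mechanism: P(Y is observed | X = x), logistic in x."""
    return 1.0 / (1.0 + math.exp(-(0.5 + x)))


def simulate(n, seed=0):
    """Draw (X, Y, delta); Y = X + noise, and Y is set to None when missing."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = x + rng.gauss(0.0, 1.0)
        delta = rng.random() < response_prob(x)  # missingness depends on X only
        sample.append((x, y if delta else None, delta))
    return sample


def ipw_cdf(sample, t):
    """Weighted estimate of F(t) = P(Y <= t) using weights 1 / response_prob(x)."""
    num = den = 0.0
    for x, y, delta in sample:
        if delta:
            w = 1.0 / response_prob(x)
            den += w
            if y <= t:
                num += w
    return num / den


def complete_case_cdf(sample, t):
    """Unweighted empirical CDF of the observed Y values only."""
    observed = [y for _, y, delta in sample if delta]
    return sum(y <= t for y in observed) / len(observed)


sample = simulate(20000)
print("IPW estimate of F(0):          ", round(ipw_cdf(sample, 0.0), 3))
print("Complete-case estimate of F(0):", round(complete_case_cdf(sample, 0.0), 3))
```

In this simulated model Y is normal with mean 0, so the true value of F(0) is 0.5. The weighted estimate should land close to it, while the complete-case estimate is biased downward because units with large X, and hence large Y, are observed more often.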
Keywords/Search Tags: Missing data mechanism, Identification, Distribution function