
Empirical Likelihood Confidence Intervals For Mean Of The Response Variable Under Missing Data

Posted on: 2007-05-23
Degree: Master
Type: Thesis
Country: China
Candidate: W C Pang
Full Text: PDF
GTID: 2120360212473262
Subject: Probability theory and mathematical statistics
Abstract/Summary:
In this thesis we study empirical likelihood confidence intervals for the mean of the response variable under missing data in regression models, in three settings.

1. Empirical likelihood confidence intervals for the mean of the response variable when both the factor variable and the response variable may be missing.

Consider the nonparametric regression model

$Y = m(X) + \varepsilon$ (1.1)

where $X$ is a $d$-dimensional random vector of factors, $Y$ is a response variable, $m(\cdot)$ is an unknown function, and $\varepsilon$ is a random error with $E\varepsilon = 0$ and $0 < \sigma^2 = E\varepsilon^2 < \infty$ that is independent of $X$. In practice we often obtain a random sample of incomplete data $(X_i, Y_i, \delta_{X_i}, \delta_{Y_i})$, where $\delta_{X_i}$ and $\delta_{Y_i}$ indicate whether $X_i$ and $Y_i$ are observed, respectively. Assume that $\delta_X$ is independent of $\delta_Y$ and that $(X, Y, \varepsilon)$ is independent of $(\delta_X, \delta_Y)$. This assumption contains the missing completely at random (MCAR) assumption, that is, $P(\delta_Y = 1 \mid Y) = P(\delta_Y = 1) = c_1$ and $P(\delta_X = 1 \mid X) = P(\delta_X = 1) = c_2$, where $c_1$ and $c_2$ are constants.

After defining several index sets, we impute $Y_i$ as follows:
(a) for $i \in S_{rr} \cup S_{mr}$, $Y_i$ is not imputed;
(b) in a second case, $Y_i$ is imputed using a kernel function $K(\cdot)$ and a bandwidth sequence $h_n$ that decreases to 0 as $n \to \infty$;
(c) for $i \in S_{mm}$, $Y_i$ is also imputed.
This gives a completed sample for $Y$, (1.2). To avoid technical difficulties caused by small values in the denominator of the imputation formula, we apply a truncation with a sequence $b_n > 0$ that decreases to 0 as $n \to \infty$, and obtain the completed sample for $Y$, (1.3). Following Owen (1988), we obtain the empirical log-likelihood ratio (1.4) for $\theta = EY$, where $\lambda_n = \lambda_n(\theta)$ satisfies (1.5).

The following assumptions are needed. Denote by $f(\cdot)$ the probability density of $X$.
(C.f) $f(x)$ has bounded partial derivatives up to order $k$ ($> d$).
(C.m) $m(x)$ has bounded partial derivatives up to order $k$ ($> d$).
(C.Y) $EY^2 < \infty$.
(C.$gmb_n$)
(C.K) (i) The kernel function $K(\cdot)$ is bounded with bounded support. (ii) $K(\cdot)$ is a kernel of order $k$ ($> d$).
(C.$h_n$) (i)

THEOREM 1.1. Under the assumptions listed above, if $\theta$ is the true parameter, we have (1.6), where $\chi^2_1$ is a standard $\chi^2$ variable with one degree of freedom.

The asymptotic distribution of $\hat{l}_n(\theta)$ is not a standard chi-square, so it cannot be used directly to construct an interval estimator for $\theta$. We therefore define an adjusted empirical log-likelihood ratio $\hat{l}_{n,ad}(\theta)$ (see page 5).

THEOREM 1.2. Under the assumptions listed above, if $\theta$ is the true parameter, $\hat{l}_{n,ad}(\theta)$ has an asymptotic $\chi^2_1$ distribution, that is, $P(\hat{l}_{n,ad}(\theta) \le c_\alpha) = 1 - \alpha + o(1)$, where $P(\chi^2_1 \le c_\alpha) = 1 - \alpha$.

2. Empirical likelihood confidence intervals for the mean of the response variable under missing data when two populations share the same linear regression model.

Let $(X_1, Y_1)$ and $(X_2, Y_2)$ be two independent populations, where $(X_i, Y_i)$ is an $R^d \times R^1$ random vector, $i = 1, 2$. Suppose they follow the same linear regression model

$Y = X^\tau \beta + \varepsilon$ (2.1)

where $\beta$ is a $d$-dimensional constant vector and $\varepsilon$ is a random error with $E\varepsilon = 0$ and $0 < \sigma^2 = E\varepsilon^2 < \infty$ that is independent of $X$. In practice we often obtain complete data from population $(X_1, Y_1)$, denoted $(X_{11}, Y_{11}), \ldots, (X_{n_1 1}, Y_{n_1 1})$, and incomplete data from population $(X_2, Y_2)$, denoted $(X_{12}, *), \ldots, (X_{n_2 2}, *)$, where $*$ indicates that $Y_{i2}$ is missing. We use the complete data from population $(X_1, Y_1)$ to construct the least-squares estimator $\hat{\beta}$ of $\beta$, (2.2), and then use the fitted values based on $\hat{\beta}$ to impute $Y_{i2}$. Following Owen (1988), we obtain the empirical log-likelihood ratio (2.3) for $\theta = EY_2$, with the multiplier defined by (2.4).

THEOREM 2.1. Under the stated conditions, if $\theta$ is the true parameter, we have (2.5), where $\chi^2_1$ is a standard $\chi^2$ variable with one degree of freedom.

The asymptotic distribution of this ratio is not a standard chi-square, so it cannot be used directly to construct an interval estimator for $\theta$. We therefore define an adjusted empirical log-likelihood ratio $\hat{l}_{n_2,ad}(\theta)$ (see page 13).

THEOREM 2.2. Under the stated conditions, if $\theta$ is the true parameter, $\hat{l}_{n_2,ad}(\theta)$ has an asymptotic $\chi^2_1$ distribution, that is, $P(\hat{l}_{n_2,ad}(\theta) \le c_\alpha) = 1 - \alpha + o(1)$, where $P(\chi^2_1 \le c_\alpha) = 1 - \alpha$.

3. Empirical likelihood confidence intervals for the mean of the response variable under missing data when two populations share the same nonparametric regression model.

Let $(X_1, Y_1)$ and $(X_2, Y_2)$ be two independent populations, where $(X_i, Y_i)$ is an $R^1 \times R^1$ random vector, $i = 1, 2$. Suppose they follow the same nonparametric regression model

$Y = m(X) + \varepsilon$ (3.1)

where $m(\cdot)$ is an unknown function and $\varepsilon$ is a random error with $E\varepsilon = 0$ and $0 < \sigma^2 = E\varepsilon^2 < \infty$ that is independent of $X$. In practice we often obtain complete data from population $(X_1, Y_1)$, denoted $(X_{11}, Y_{11}), \ldots, (X_{n_1 1}, Y_{n_1 1})$, and incomplete data from population $(X_2, Y_2)$, denoted $(X_{12}, *), \ldots, (X_{n_2 2}, *)$, where $*$ indicates that $Y_{i2}$ is missing. We use the complete data from population $(X_1, Y_1)$ to construct an estimator $\hat{m}(x)$ of $m(x)$, (3.2), where $K(\cdot)$ is a kernel function, $h$ is a bandwidth sequence that decreases to 0 as $n_1 \to \infty$, and $K_h(X, x) = h^{-1} K((X - x)/h)$. We then use $\hat{Y}_{i2} = \hat{m}(X_{i2})$ to impute $Y_{i2}$. Following Owen (1988), we obtain the empirical log-likelihood ratio (3.3) for $\theta = EY_2$, with the multiplier defined by (3.4).

The following assumptions are needed.
(1) $K(\cdot)$ is a twice differentiable symmetric density function.
(2) $n_j h^4 \to 0$ and $n_j h \to \infty$, $j = 1, 2$.
(3) $n_1 / n_2 \to \lambda$, $0 < \lambda < \infty$.
(4) Both $m^{(1)}(x)$ and $m^{(2)}(x)$ exist and are continuous and bounded.
(5) $X_1$ and $X_2$ have common compact support, with corresponding twice continuously differentiable density functions.
(6) $EY_2^2 < \infty$.

THEOREM 3.1. Under the assumptions listed above, if $\theta$ is the true parameter, we have (3.5), where $\chi^2_1$ is a standard $\chi^2$ variable with one degree of freedom.

The asymptotic distribution of this ratio is not a standard chi-square, so it cannot be used directly to construct an interval estimator for $\theta$. We therefore define an adjusted empirical log-likelihood ratio $\hat{l}_{n_2,ad}(\theta)$.

THEOREM 3.2. Under the assumptions listed above, if $\theta$ is the true parameter, $\hat{l}_{n_2,ad}(\theta)$ has an asymptotic $\chi^2_1$ distribution, that is, $P(\hat{l}_{n_2,ad}(\theta) \le c_\alpha) = 1 - \alpha + o(1)$, where $P(\chi^2_1 \le c_\alpha) = 1 - \alpha$.
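As a rough illustration of the impute-then-empirical-likelihood idea in the linear-model setting, the following is a minimal sketch under stated assumptions, not the thesis's own procedure. It fits least squares on a complete sample, imputes the missing responses with fitted values, and evaluates the standard Owen-type log ratio $2\sum_i \log(1 + \lambda(Z_i - \theta))$, where $\lambda$ solves $\sum_i (Z_i - \theta)/(1 + \lambda(Z_i - \theta)) = 0$. The function names `impute_linear` and `el_log_ratio`, the simulated data, and the bisection solver are all hypothetical choices made for this example; equations (2.2) to (2.4) are not reproduced in this abstract, so any correspondence with them is assumed. The adjustment factor that the theorems rely on is also not given here, so the value computed below is the unadjusted ratio and should not be compared with $\chi^2_1$ quantiles as it stands.

```python
import numpy as np


def impute_linear(x1, y1, x2):
    """Fit least squares on the complete sample (x1, y1) and return
    fitted values for the sample x2 whose responses are missing."""
    beta_hat, *_ = np.linalg.lstsq(x1, y1, rcond=None)
    return x2 @ beta_hat


def el_log_ratio(z, theta, tol=1e-10, max_iter=200):
    """Unadjusted Owen-type empirical log-likelihood ratio for a mean.

    With d_i = z_i - theta, solve sum(d_i / (1 + lam * d_i)) = 0 for lam
    by bisection, then return 2 * sum(log(1 + lam * d_i)).
    """
    d = np.asarray(z, dtype=float) - theta
    if d.min() >= 0 or d.max() <= 0:
        return np.inf  # theta is outside the convex hull of the data
    # Every weight must stay positive: 1 + lam * d_i > 0 for all i.
    lo, hi = -1.0 / d.max(), -1.0 / d.min()
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps
    for _ in range(max_iter):
        lam = 0.5 * (lo + hi)
        g = np.sum(d / (1.0 + lam * d))  # strictly decreasing in lam
        if g > 0:
            lo = lam
        else:
            hi = lam
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * d))


# Simulated data (assumption): two samples sharing Y = X' beta + eps.
rng = np.random.default_rng(0)
beta = np.array([1.0, -2.0])
x1 = rng.normal(size=(200, 2))
y1 = x1 @ beta + rng.normal(size=200)
x2 = rng.normal(loc=0.5, size=(150, 2))  # responses for x2 are missing

y2_hat = impute_linear(x1, y1, x2)
theta0 = y2_hat.mean()
print(el_log_ratio(y2_hat, theta0))        # ratio at the sample mean
print(el_log_ratio(y2_hat, theta0 + 0.3))  # ratio away from the sample mean
```

Bisection is used here because the estimating function is strictly decreasing in $\lambda$ on the interval where all weights are positive, so the root is unique and no external solver is needed.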
Keywords/Search Tags: Empirical likelihood, Missing data, Linear regression, Nonparametric regression, Confidence intervals