Estimation And Variable Selection Of Conditional Quantile And Linear Model Under Left Truncated Data

Posted on:2018-02-03

Degree:Doctor

Type:Dissertation

Country:China

Candidate:M Yao

Full Text:PDF

GTID:1310330512989850

Subject:Probability theory and mathematical statistics

Abstract/Summary:

In data analysis we often encounter right-censored and left-truncated data, which have important applications in survival analysis, medical statistics, astronomy, economics and engineering reliability. The earlier literature mainly addressed right-censored data; in recent years left-truncated data have attracted increasing attention. Many papers have constructed estimators of the conditional distribution function, the conditional quantile function and the regression function, and established their large-sample properties. In this thesis we study estimation of the conditional quantile function, and parameter estimation and variable selection in the linear regression model, under left-truncated data. This work supplements and improves the existing methods and theory for left-truncated data. The specific contributions are as follows.

In chapter 2, for the left-truncated model with an i.i.d. sample, we construct weighted double kernel local linear (WDKLL) estimators of the conditional distribution function, the conditional probability density function and the conditional quantile function, and establish their asymptotic normality. Yu and Jones (1998) proposed the double kernel local linear (DKLL) estimator of the conditional distribution function and studied the resulting DKLL estimator of the conditional quantile function for complete data. Because the DKLL method is built on local linear fitting, it shares the advantages that the standard local linear estimator has over the Nadaraya-Watson (N-W) estimator, such as automatic adaptation to edge effects. As far as we know, the DKLL approach to estimating the conditional distribution function and the conditional quantile function has not been studied in the literature for the left-truncated model, even under independence.

In the left-truncated setting, let {(X_k, Y_k, T_k), k ≥ 1} be a sequence of independent and identically distributed random vectors distributed as (X, Y, T). We assume throughout that T and (X, Y) are independent, where T is the truncation variable with continuous distribution function G(·). The lifetime Y_i is interfered with by the truncation variable T_i in such a way that both Y_i and T_i are observable only when Y_i ≥ T_i, whereas neither is observed if Y_i < T_i, for i = 1, …, N, where N is the potential sample size. Because of truncation, N is unknown; n is the size of the sample actually observed, and n < N. Let θ = P(Y ≥ T) be the probability that Y is observable. Based on the double kernel local linear estimation approach of Yu and Jones (1998), the WDKLL estimator of F(y|x) is defined as F_{h1,h2}(y|x) = β0, where β0 is the intercept component of the solution of a weighted local linear optimization problem. The WDKLL estimator ξ_{p,n}(x) of the conditional quantile ξ_p(x) is then defined from F_{h1,h2}(y|x). After algebraic simplification of F_{h1,h2}(y|x), we also obtain a nonparametric estimator f_{h1,h2}(y|x) of the conditional density f(y|x), expressed through W_{h2}(·) = W(·/h2)/h2. We establish the asymptotic normality of F_{h1,h2}(y|x), f_{h1,h2}(y|x) and ξ_{p,n}(x). Moreover, the finite-sample behavior observed in Monte Carlo simulations is consistent with our theoretical conclusions. The results of chapter 2 have been published in Communications in Statistics - Theory and Methods.

For the left-truncated model, the independence assumption on the observations may be justified in some cases, for example when the survival data come from a separate group. In survival analysis, however, much of the data we encounter is dependent, and the dependent-data scenario is important in a number of applications. For example, when clusters of individuals are sampled (family members, or repeated measurements on the same individual), or when the data are obtained from records over time, lifetimes within clusters are typically correlated [see Kang and Koehler (1997) or Cai et al. (2000)]. Statistical inference for the left-truncated model under dependence is therefore of great theoretical and practical significance.

In chapter 3, for the left-truncated model, we construct the nonparametric WDKLL estimators of the conditional distribution function and the conditional quantile function by the double kernel local linear approach. Assuming that the observed sample is a stationary α-mixing sequence, we establish the asymptotic normality of F_{h1,h2}(y|x) and ξ_{p,n}(x) using probability inequalities for mixing sequences and Bernstein's big-block and small-block technique. Moreover, finite-sample simulations show that our estimator outperforms the ordinary kernel estimator, which confirms the effectiveness of the method. The results of chapter 3 have been submitted to Acta Mathematica Sinica, Chinese Series.

Quantile regression (QR) was first proposed by Koenker and Bassett (1978) and has been widely used in application areas such as econometrics, the social sciences and biomedicine; the method is studied in detail in Koenker (2005). However, the QR procedure can have arbitrarily small relative efficiency compared with least squares (LS). To overcome this drawback, Zou and Yuan (2008) proposed composite quantile regression (CQR) for estimating the regression coefficients in the classical linear regression model. CQR inherits the robustness of QR while improving its efficiency, which makes it an effective and robust parameter estimation method, and both QR and CQR have been active research topics in recent years. To the best of our knowledge, the CQR method has not been developed for left-truncated data.

In chapter 4, we construct the CQR estimator of the regression coefficients in the classical linear regression model under left-truncated data. In addition, we consider an adaptive penalized procedure for building parsimonious and robust models, and establish the asymptotic normality and oracle property of the resulting estimators. We consider the linear regression model Y = X^T β + ε under left-truncated data, where X is a p × 1 random vector of covariates, β is a p × 1 vector of unknown parameters, and ε is a random error independent of X. The CQR estimator β_CQR of the regression parameter vector β is defined as the solution of an optimization problem. We use β_CQR to construct the adaptive lasso penalty, and the adaptive lasso penalized CQR (ACQR) estimator β_ACQR is defined as the solution of the corresponding penalized optimization problem. Under some conditions we obtain the asymptotic distribution of the CQR estimator β_CQR. We then establish the convergence rate and oracle property of the ACQR estimator β_ACQR, namely its rate of consistency, consistent selection, and asymptotic normality. Moreover, simulation studies illustrate the finite-sample performance of the proposed method. The results of chapter 4 have been submitted to Statistical Papers and accepted with minor revision.

For left-truncated data and other incomplete data, many statistical inference problems remain open. Chapter 5 outlines future work. First, for left-truncated and dependent data, the composite quantile regression problem under the linear regression model and under semiparametric varying-coefficient partially linear regression models. Second, for the left-truncated and right-censored (LTRC) model, the double kernel local linear estimators of the conditional distribution function and the conditional quantile function, as well as the quantile regression problem under the LTRC model.
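
For orientation, the complete-data objectives that the abstract builds on can be sketched as below. These are the forms in which the DKLL estimator of Yu and Jones (1998) and the CQR and adaptive-lasso CQR estimators of Zou and Yuan (2008) are usually stated; they are not the thesis's truncated-data versions, which are not shown in this record. The kernel K, the distribution function Ω of W, a scalar covariate X in the first display, the number of quantile levels q, the intercepts b_k, the tuning parameter λ and the exponent γ are all notation assumed here for illustration.

```latex
% DKLL estimator of F(y|x) for complete data (scalar covariate assumed)
\hat F_{h_1,h_2}(y \mid x) = \hat\beta_0, \qquad
(\hat\beta_0, \hat\beta_1) = \arg\min_{\beta_0,\beta_1}
  \sum_{i=1}^{n} \Bigl[ \Omega\!\Bigl(\frac{y - Y_i}{h_2}\Bigr)
     - \beta_0 - \beta_1 (X_i - x) \Bigr]^2
  K\!\Bigl(\frac{x - X_i}{h_1}\Bigr),
\qquad \Omega(u) = \int_{-\infty}^{u} W(v)\,dv .

% Conditional quantile obtained by inverting the estimated distribution function
\hat\xi_p(x) = \inf\{\, y : \hat F_{h_1,h_2}(y \mid x) \ge p \,\}.

% Check loss and CQR objective for complete data, with \tau_k = k/(q+1)
\rho_\tau(u) = u\,\bigl(\tau - \mathbf{1}\{u < 0\}\bigr), \qquad
(\hat b_1,\dots,\hat b_q,\hat\beta_{\mathrm{CQR}}) =
  \arg\min_{b_1,\dots,b_q,\beta}
  \sum_{k=1}^{q} \sum_{i=1}^{n} \rho_{\tau_k}\bigl(Y_i - b_k - X_i^{\top}\beta\bigr).

% Adaptive-lasso penalized CQR, with weights built from the unpenalized CQR fit
\hat\beta_{\mathrm{ACQR}} =
  \arg\min_{b_1,\dots,b_q,\beta}
  \sum_{k=1}^{q} \sum_{i=1}^{n} \rho_{\tau_k}\bigl(Y_i - b_k - X_i^{\top}\beta\bigr)
  + \lambda \sum_{j=1}^{p} \frac{|\beta_j|}{|\hat\beta_{\mathrm{CQR},j}|^{\gamma}},
  \qquad \gamma > 0 .
```

Under left truncation, a common device is to multiply the i-th summand by a weight proportional to 1/G_n(Y_i), where G_n is a product-limit estimate of G; whether the thesis uses exactly this weighting is an assumption here.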
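
The left-truncation mechanism described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the thesis's simulation design: the lifetime distribution Exp(1), the truncation distribution U(0,1), and the function names `simulate_left_truncation` and `lynden_bell_G` are hypothetical choices made for illustration. The product-limit (Lynden-Bell type) estimator of G is a standard tool for truncated data that the abstract itself does not name.

```python
import math
import random


def simulate_left_truncation(N, seed=0):
    """Draw N potential pairs (Y, T) and keep only those with Y >= T.

    Assumed design (not from the thesis): Y ~ Exp(1), T ~ U(0, 1),
    with T independent of Y.
    """
    rng = random.Random(seed)
    observed = []
    for _ in range(N):
        y = rng.expovariate(1.0)   # potential lifetime
        t = rng.uniform(0.0, 1.0)  # truncation time
        if y >= t:                 # the pair is seen only when Y >= T
            observed.append((y, t))
    return observed


def lynden_bell_G(observed):
    """Return t0 -> G_n(t0), a product-limit estimate of the df of T.

    G_n(t0) is the product over observed T_i > t0 of (1 - 1 / R(T_i)),
    where R(s) counts the observed pairs with T_j <= s <= Y_j.
    Ties are ignored, which is harmless for continuous data.
    """
    def risk_count(s):
        return sum(1 for y, t in observed if t <= s <= y)

    # R(T_i) >= 1 always, because pair i itself satisfies T_i <= T_i <= Y_i.
    factors = [(t, 1.0 - 1.0 / risk_count(t)) for _, t in observed]

    def G_n(t0):
        prod = 1.0
        for t, f in factors:
            if t > t0:
                prod *= f
        return prod

    return G_n


# Usage: in a simulation N is known, so n/N can be compared with
# theta = P(Y >= T), which equals 1 - exp(-1) under the assumed design.
N = 1000
sample = simulate_left_truncation(N, seed=1)
n = len(sample)
G_n = lynden_bell_G(sample)
print("observed n =", n, "out of N =", N)
print("n/N =", n / N, "theta =", 1 - math.exp(-1))
print("G_n(0.5) =", G_n(0.5), "true G(0.5) = 0.5")
```

In real data N is unknown, so θ cannot be estimated by n/N; estimates such as G_n are what make reweighting the observed sample possible.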

Keywords/Search Tags:

left-truncated data, conditional quantile, double-kernel local linear estimator, α-mixing sequence, composite quantile regression, variable selection, asymptotic normality, Oracle property

Related items

1	Weighted Local Linear Estimator Of Conditional Quantile Under Left-truncated And Dependent Data
2	Weighted Composite Quantile Regression For Linear Model With Randomly Truncated Data
3	Research On The Asymptotic Properties Of Conditional Quantile Estimator Under Functional Stationary Ergodic Data
4	Weighted Local Linear Dual Kernel Estimation Of Conditional Quantile Under Right-censored Dependent Data
5	Robust Variable Selection With Outliers Based On Combined Quantile Regression
6	Parameter Estimation And Variable Selection For Panel Data Quantile Regression Model
7	Composite Quantile Regression For Censored Data
8	Quantile Regression Under Equality Constraint With Factor Variable Selection
9	Research Based On Asymptotic Properties Of Nonparameter Kernel Mode Estimator For Dependent Left-truncated Data
10	The Asymptotic Property Of Extreme Value Estimator And Quantile Estimator