Statistical Inference For Some Single-index Models

Posted on: 2018-07-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Y Li
Full Text: PDF
GTID: 1319330542951137
Subject: Financial mathematics and financial engineering
Abstract/Summary:
Interest in semiparametric modeling has grown quickly over the last decades. Semiparametric models are flexible yet easy to interpret: they allow a straightforward interpretation of the effect of each variable, and they may be preferred to a completely nonparametric regression because of the well-known curse of dimensionality. The single-index model is a common semiparametric model, widely used in biostatistics, medicine, economics, financial econometrics and other fields.

Let $Y$ be a random variable, $X$ a $d$-dimensional random vector and $\theta$ the unknown index parameter. The common single-index models are:

1. Single-index mean model: the conditional expectation of $Y$ given $X$ equals the conditional expectation of $Y$ given $X^\top\theta$, that is, $E[Y \mid X] = E[Y \mid X^\top\theta] = g(X^\top\theta)$, where the function $g(\cdot)$ is unknown. See Hall et al. (1993), Hristache et al. (2001), Delecroix et al. (2006), Xia et al. (2002), Cui et al. (2011), among others.

2. Single-index law model: the conditional law of $Y$ given $X$ equals the conditional law of $Y$ given $X^\top\theta$, that is, $F(Y \mid X) = F(Y \mid X^\top\theta) = g(X^\top\theta)$, where the function $g(\cdot)$ is unknown. See Delecroix et al. (2003), Hall & Yao (2005) and Ma & Zhu (2013).

3. Single-index quantile model: let $Q_\alpha(Y \mid X)$ be the conditional $\alpha$-quantile of $Y$ given $X$, where the level $\alpha$ lies strictly between 0 and
1; then the model can be written as $Q_\alpha(Y \mid X) = g(X^\top\theta)$, where the function $g(\cdot)$ is unknown. See Chaudhuri et al. (1997), Kong & Xia (2012), Wu et al. (2010), Ma & He (2016), among others.

Because the function $g$ in the single-index model is unknown, the parameter $\theta$ is not uniquely determined, and some restriction is needed for identification. One option is to impose $\|\theta\| = 1$ with first element $\theta_1 > 0$; see Lin & Kulasekera (2007). The other is to fix the first element of $\theta$, for example $\theta_1 = 1$. This thesis adopts the second option.

This thesis studies three single-index models: 1) a general single-index assumption (which includes the single-index mean model and the single-index law model); 2) censored data under a single-index assumption; 3) a single-index model with an additional conditional variance restriction. We propose new inference approaches for these models, prove their asymptotic properties, and construct confidence intervals and hypothesis tests.

In Chapter 1, we propose a new approach to inference in single-index models. The single-index assumption is a common dimension reduction device that achieves a compromise between purely parametric and purely nonparametric modeling. Suppose $Y$ is the response variable and $X$ is the $d$-dimensional explanatory variable. If $\{T_u : u \in U\}$ is a family of transformations of $Y$, then under the single-index assumption there exists a unique $\theta_0$ such that

$$E[T_u(Y) \mid X] = E[T_u(Y) \mid X^\top\theta_0], \quad \forall u \in U. \tag{0.0.2}$$

The vector $\theta_0$ is the index parameter to be estimated, and it belongs to the parameter set $B \subset \{(\theta_1, \dots, \theta_d) : \theta_1 = 1\} \subset \mathbb{R}^d$. In particular, we consider the single-index mean regression model and the single-index conditional law model.

Our approach is based on the following idea. For each $\theta \in B$, let $f_\theta(\cdot)$ be the density of $X^\top\theta$. Let $g_u(Y, X, \theta) = \{T_u(Y) - E[T_u(Y) \mid X^\top\theta]\}\, f_\theta(X^\top\theta)$ for $u \in U$ and $\theta$ in
$B$. Then condition (0.0.2) can be written as $E[g_u(Y, X, \theta) \mid X] = 0$ for all $u \in U$ if and only if $\theta = \theta_0$. Assume that $\omega(\cdot)$ has an integrable, strictly positive Fourier transform. Define the real-valued contrast function

$$Q(\theta) = \int_U E\big[g_u(Y_1, X_1, \theta)^\top g_u(Y_2, X_2, \theta)\, \omega(X_1 - X_2)\big]\, d\mu(u), \quad \theta \in B, \tag{0.0.3}$$

where $(Y_1, X_1)$ and $(Y_2, X_2)$ are independent copies of $(Y, X)$ and $\mu$ is some probability measure with support $U$, equipped with the Borel $\sigma$-field. The parameter $\theta_0$ can then be identified as the unique root of the contrast $Q(\cdot)$.

Our estimation approach is to build a sample-based approximation of $Q(\theta)$ and to minimize it with respect to $\theta$. Since the function $g_u$ is unknown, we use its kernel estimator without the denominators to construct the approximation of $Q(\theta)$. This kind of estimation lets us avoid trimming functions and allows the explanatory variables to have unbounded support. To the best of our knowledge, our method is the only one that allows both.

We show that the estimator $\hat\theta$ is consistent and $\sqrt{n}$-asymptotically normal. The asymptotic variance of $\hat\theta$ has a complicated form. To approximate the law of $\hat\theta$ in small and moderate samples, we propose a simulation-based approach similar to that of Lavergne & Patilea (2013); see also Jin et al. (2001). The idea is to build a suitable randomly perturbed version of the criterion $Q(\theta)$ and to compute its minimizer. Conditionally on the original sample, the law of this minimizer is shown to be close to the law of $\hat\theta$. It then suffices to repeat the random perturbation many times to obtain a simulation-based approximation of the law of $\hat\theta$. In addition, we present the results of several simulation experiments and use real data to evaluate the performance of the new estimator. The empirical results are encouraging.

In Chapter 2, we consider censored data. We propose an original dimension reduction idea based on a single-index hypothesis. We then estimate the direction of the index parameter $\theta$
using an SMD-type (smooth minimum distance) method. Let $T$ be a random variable taking values in $(-\infty, \infty]$. Although $T \ge 0$ is often assumed in models of this type, formally we do not need this constraint. Consider the case where $Y$ is a real-valued random variable, $\delta$ is an indicator variable and $X \in \mathcal{X}$ is an explanatory variable. The indicator tells us whether $Y$ is exactly equal to $T$ or strictly smaller than $T$; in other words, $\delta = 1$ if $Y = T$ and $\delta = 0$ if $Y < T$. Our objective is to estimate the law of $T$ given $X$. The conditional probability of the event $\{T = \infty\}$ may be positive. The observations are characterized by the conditional sub-probabilities

$$H_1((-\infty, t] \mid x) = P(Y \le t, \delta = 1 \mid X = x), \qquad H_0((-\infty, t] \mid x) = P(Y \le t, \delta = 0 \mid X = x), \qquad t \in \mathbb{R},\ x \in \mathcal{X}.$$

The conditional law of $Y$ is then

$$H((-\infty, t] \mid x) = P(Y \le t \mid X = x) = H_0((-\infty, t] \mid x) + H_1((-\infty, t] \mid x).$$

To estimate the conditional distribution of $T$, we consider the following model: there exists a random variable $C$, the right-censoring time, such that $Y = T \wedge C$ and $\delta = 1\{T \le C\}$. Under suitable identification conditions (for instance, $T$ and $C$ independent given $X$), the conditional distribution of $T$ can be expressed as a closed-form functional of $H_0(\cdot \mid x)$ and $H_1(\cdot \mid x)$. Replacing $H_0(\cdot \mid x)$ and $H_1(\cdot \mid x)$ by estimators then gives an easily computed estimate of the conditional distribution, called the conditional Kaplan-Meier estimator; see Beran (1981), Dabrowska (1989), van Keilegom & Veraverbeke (1996). When the dimension of $X$ is greater than 1, all of these approaches suffer from the curse of dimensionality.

In this chapter, we propose a single-index dimension reduction method, which can be seen as an extension of the approach in Chapter 1. The originality comes from the fact that the single-index condition is imposed on the observed variables $(Y, \delta)$. More precisely, with $\mathcal{X} = \mathbb{R}^d$, we require that $(Y, \delta) \perp X \mid X^\top\theta_0$, where $\theta_0$ belongs to
$B \subset \mathbb{R}^d$. To estimate $\theta_0$, we adapt the approach proposed in Chapter 1 to the case of censored data, and we prove consistency and asymptotic normality of the resulting estimator. Confidence intervals are constructed by the random perturbation method of Chapter 1. Finally, we give an estimator of the conditional probability of the event $\{T = \infty\}$. It is important to mention that, unlike existing approaches such as Bouaziz & Lopez (2010), Xia et al. (2010) and Strzalkowska-Kominiak & Cao (2013), in our approach the single-index hypothesis can be easily tested using the approach of Maistre & Patilea (2014). We present the results of several simulation experiments and use real data to evaluate the performance of the new approach.

In Chapter 3, we consider a single-index model with an additional variance restriction. In applications, models defined by estimating equations for the first- and second-order conditional moments are widely used; see Ziegler (2011). Here we consider an extension of the model of Cui et al. (2011). Let $(Y, X^\top)^\top$ be the observation, where $Y$ is a count variable and $X$ is the vector of $d$ explanatory variables. Assume that there exists $\theta_0 \in \mathbb{R}^d$, unique up to a scale normalization factor, such that

$$E(Y \mid X) = E(Y \mid X^\top\theta_0) = r(X^\top\theta_0; \theta_0),$$

where $r(\cdot)$ is an unknown function, and that for some real value $\alpha_0$,

$$\mathrm{Var}(Y \mid X) = g(E(Y \mid X), \alpha_0) = g(r(X^\top\theta_0; \theta_0), \alpha_0),$$

where the function $g(\cdot, \cdot)$ is known and, for each $r$, the map $\alpha \mapsto g(r, \alpha)$ is one-to-one.

We propose a new semiparametric estimation procedure for this single-index regression which incorporates the additional information on the conditional variance of $Y$. This approach extends to a semiparametric framework the quasi-generalized pseudo maximum likelihood method introduced by Gourieroux et al. (1984a, 1984b). More precisely, we estimate the parameter $\theta_0$ and the function $r(\cdot)$ by a two-step pseudo-maximum likelihood (PML) procedure based on the densities of linear exponential families with nuisance parameter. The densities we use are parameterized
by the mean $r$ and a nuisance parameter in the variance. We use a likelihood-type criterion, but the asymptotic results do not require any assumption on the conditional distribution of $Y$ given $X$. Since the regression function $r(\cdot)$ is unknown, a nonparametric estimator is needed to construct the pseudo-maximum likelihood criterion. The choice of the smoothing parameter in this nonparametric estimation is a major issue. The existing semiparametric index regression literature mainly concerns the estimation of the index, and only a few references address the choice of the smoothing parameter. Even though the smoothing parameter does not influence the asymptotic variance of a semiparametric estimator of $\theta_0$, it may still influence the estimator of $\theta_0$ and of the regression function.

We propose an automatic and natural choice of the smoothing parameter used to define the semiparametric estimator. For this, we extend the approach introduced by Härdle et al. (1993) (see also Xia & Li (1999), Xia et al. (1999) and Delecroix et al. (2006)). The idea is to maximize the pseudo-likelihood simultaneously in $\theta$ and the smoothing parameter, that is, the bandwidth of the kernel estimator. The bandwidth is allowed to range over a large interval $[n^{-1/4}, n^{-1/8}]$. In a sense, this approach treats the bandwidth as an auxiliary parameter for which the pseudo-likelihood may provide an estimate. Using a suitable decomposition of the pseudo-log-likelihood, we show that such a joint maximization is asymptotically equivalent to separately maximizing a purely parametric (nonlinear) term with respect to $\theta$ and minimizing a weighted (mean-squared) cross-validation function with respect to the bandwidth. We show that the rate of our 'optimal' bandwidth is $n^{-1/5}$, as expected for twice differentiable regression functions. In addition, we prove the asymptotic properties of the estimator, present the results of several simulation experiments, and use real data to evaluate the performance of the new approach.
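The sample-based contrast of Chapter 1 can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes the simplest case, the single-index mean model with the single transformation $T_u(y) = y$, so the integral over $u$ disappears. The Gaussian smoothing kernel, the Gaussian weight standing in for $\omega$ (its Fourier transform is integrable and strictly positive), the bandwidth `h = 0.3`, the simulated data and the function names `gaussian` and `sample_contrast` are all illustrative assumptions. The quantity $\{Y - E[Y \mid X^\top\theta]\} f_\theta(X^\top\theta)$ is estimated by a kernel sum with no density denominator, so no trimming is needed.

```python
import numpy as np


def gaussian(t):
    """Standard normal density, used here as the smoothing kernel K (an assumption)."""
    return np.exp(-0.5 * t ** 2) / np.sqrt(2.0 * np.pi)


def sample_contrast(theta, X, Y, h):
    """U-statistic approximation of Q(theta) for T_u(y) = y (mean model only)."""
    n = len(Y)
    index = X @ theta
    # K_h((X_i - X_k)^T theta) for all pairs; the diagonal is excluded.
    K = gaussian((index[:, None] - index[None, :]) / h) / h
    np.fill_diagonal(K, 0.0)
    # Kernel estimate of {Y_i - E[Y | index_i]} * f_theta(index_i), no denominator.
    g_hat = ((Y[:, None] - Y[None, :]) * K).sum(axis=1) / (n - 1)
    # Gaussian weight omega(X_i - X_j), assumed here for illustration.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-0.5 * sq_dist)
    np.fill_diagonal(W, 0.0)
    # Average of g_hat_i * g_hat_j * omega(X_i - X_j) over pairs i != j.
    return float(g_hat @ W @ g_hat) / (n * (n - 1))


# Simulated data (hypothetical): linear link, theta_0 = (1, 2), first component fixed to 1.
rng = np.random.default_rng(0)
n = 200
X = rng.standard_normal((n, 2))
theta_true = np.array([1.0, 2.0])
Y = X @ theta_true + 0.1 * rng.standard_normal(n)

q_true = sample_contrast(theta_true, X, Y, h=0.3)
q_wrong = sample_contrast(np.array([1.0, -2.0]), X, Y, h=0.3)
print(q_true, q_wrong)
```

In this sketch the contrast evaluated at the true index should be much smaller than at a badly misspecified one. An estimator in the spirit of the abstract would minimize `sample_contrast` over the free components of `theta`, for instance with a grid search or a numerical optimizer.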
Keywords/Search Tags: bandwidth selection, bootstrap, conditional law, cure rate, kernel smoothing, linear exponential densities, semiparametric regression, semiparametric pseudo-maximum likelihood, single-index assumption, U-statistics