
Statistical Inferences For Some Quantile Regression Models Based On Asymmetric Laplace Distribution

Posted on: 2018-10-01 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: F K Yan
GTID: 1310330512981448 | Subject: Probability theory and mathematical statistics
Abstract/Summary:
The quantile regression model has gradually become an attractive statistical tool for exploring the relationship between a response variable and a set of explanatory covariates since the seminal work of Koenker and Bassett (1978). Because a set of quantiles often provides a more comprehensive description of the response distribution than the usual mean or median, and is more robust against heavy-tailed error distributions, quantile regression offers a practically important alternative to classical mean or median regression. Quantile regression models have been widely applied in many fields, such as economics, engineering, and biology. This thesis studies statistical inference for several quantile regression models based on the asymmetric Laplace distribution (ALD). The models considered include the quantile regression model, the censored quantile regression model, the finite mixture quantile regression model, and the median regression model with a change point. The inference procedures involve several statistical algorithms, including a non-iterative sampling algorithm, a stochastic EM algorithm, and a Gibbs sampling algorithm for mixture quantile regression. These algorithms avoid some shortcomings of several well-known algorithms and perform satisfactorily in simulation studies and real data analyses.

1. Non-iterative sampling algorithm for the quantile regression model

Let y_i be a response variable and x_i a p × 1 vector of covariates for the i-th observation. The linear quantile regression model is given by

    y_i = x_i^T β_q + ε_i,  i = 1, …, n,

where β_q is the vector of unknown parameters of interest, ε_i is an error term whose distribution is restricted to have q-th quantile equal to zero, and x_i^T denotes the transpose of x_i (hereafter, a superscript T on a vector or matrix denotes its transpose). Thus β_q is the q-th conditional quantile regression coefficient of y_i given x_i, for i = 1, …, n. The regression estimator of β_q is the minimizer of

    S(β_q) = Σ_{i=1}^n ρ_q(y_i − x_i^T β_q),

where ρ_q(·) is the check loss function ρ_q(u) = u{q − I(u < 0)}, with I(·) denoting the indicator function. The asymmetric Laplace distribution ALD(0, σ, q), with density

    f(u; 0, σ, q) = [q(1 − q)/σ] exp{−ρ_q(u/σ)},

is often used to model the distribution of ε_i in the linear quantile regression model, because minimizing S(β_q) is equivalent to maximizing the likelihood of a linear regression model whose random errors follow the ALD.
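To make this equivalence concrete, here is a minimal numerical sketch (ours, not from the thesis): with the scale σ fixed, the negative ALD log-likelihood differs from the check loss only by a constant, so both objectives have the same minimizer. The data-generating choices and function names below are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, q = 500, 0.75
X = np.column_stack([np.ones(n), rng.normal(size=n)])        # intercept + one covariate
y = X @ np.array([1.0, 2.0]) + rng.standard_t(df=3, size=n)  # heavy-tailed errors

def check_loss(beta):
    u = y - X @ beta
    return np.sum(u * (q - (u < 0)))          # rho_q(u) = u{q - I(u < 0)}

def neg_ald_loglik(beta, sigma=1.0):
    # -log f(u) = rho_q(u/sigma) - log(q(1-q)/sigma); the log term is constant in beta
    u = (y - X @ beta) / sigma
    return np.sum(u * (q - (u < 0)))

b_check = minimize(check_loss, x0=np.zeros(2), method="Nelder-Mead").x
b_ald = minimize(neg_ald_loglik, x0=np.zeros(2), method="Nelder-Mead").x
print(b_check, b_ald)   # the two minimizers agree up to numerical tolerance
```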
In recent years, based on the ALD error assumption and its location-scale mixture representation, statistical methods for quantile regression inference have progressed rapidly. For example, Reed and Yu (2009) and Kozumi and Kobayashi (2011) developed efficient Gibbs sampling methods for Bayesian quantile regression, while Tian et al. (2014) and Zhou et al. (2014) developed EM algorithms for computing regression quantiles. Also in the Bayesian framework, Luo et al. (2012) applied this idea to quantile regression models for longitudinal data; Hu et al. (2013) and Zhao and Lian (2013) investigated Bayesian inference for quantile regression and Tobit quantile regression in single-index models, respectively; and Alhamzawi and Yu (2012) and Ji et al. (2012) studied variable selection in quantile regression and in Tobit and binary quantile regression.

Although the EM procedure is a powerful tool for finding the MLE or posterior mode under a missing-data structure, such likelihood-based inference faces several challenges, such as finding standard errors in multi-parameter problems (Meng and Rubin, 1991), dealing with nuisance parameters, and handling samples of small to moderate size, where the asymptotic theory of the MLE may not apply. Gibbs sampling and other MCMC procedures are widely used for Bayesian statistical inference, but two vital issues with such iterative samplers are easily overlooked by users: first, the draws generated within the same iterative process are hardly ever independent; second, it is hard to check convincingly whether convergence has been reached when the iterations are terminated. Tan et al. (2003) developed a non-iterative sampling algorithm with an EM-type structure based on the inverse Bayes formula (IBF), called the IBF sampler. In contrast to iterative MCMC sampling, the IBF sampler is designed to generate independently and identically distributed (i.i.d.) samples, exactly or approximately, from the observed posterior distribution; these samples can be used for statistical inference immediately, which eliminates the shortcomings of the EM and Gibbs sampling algorithms.

Inspired by Tan et al. (2003), in the second chapter we propose a non-iterative posterior sampling algorithm for the linear quantile regression model under the assumption that the random errors follow the asymmetric Laplace distribution. First, using the stochastic representation of the ALD, we augment the observed data with latent variables following the exponential distribution and obtain the structure of augmented conditional predictive distributions, as in the EM algorithm. Then, we choose the importance sampling density (ISD) using preliminary estimates from the EM algorithm, so that the overlap between the target density and the ISD is large. Finally, we combine the sampling IBF with sampling/importance resampling (SIR) to obtain approximately i.i.d. samples from the observed posterior distribution. Simulation studies and a real data analysis show that the proposed IBF sampling algorithm performs better than the Gibbs sampling and EM algorithms.
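The SIR step of such a scheme can be sketched generically. The code below is our illustrative sketch, not the thesis's exact algorithm: log_target is a placeholder for a vectorized observed log-posterior, and the normal ISD centered at an EM estimate is an assumed choice motivated by the overlap criterion above.

```python
import numpy as np
from scipy import stats

def sir_draws(log_target, em_mode, em_cov, n_prop=20000, n_keep=2000, seed=1):
    """Sampling/importance resampling from an ISD toward log_target."""
    rng = np.random.default_rng(seed)
    isd = stats.multivariate_normal(mean=em_mode, cov=em_cov)
    props = isd.rvs(size=n_prop, random_state=rng)     # proposals from the ISD
    logw = log_target(props) - isd.logpdf(props)       # importance log-weights
    w = np.exp(logw - logw.max())                      # stabilize before normalizing
    w /= w.sum()
    idx = rng.choice(n_prop, size=n_keep, replace=True, p=w)
    return props[idx]   # approximately i.i.d. draws from the target posterior
```

The closer the ISD is to the target (for instance, when it is centered at a good EM estimate), the more uniform the weights and the better the resampled draws approximate i.i.d. posterior samples.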
2. Stochastic EM algorithm for the censored quantile regression model

In this chapter, we propose a stochastic EM (SEM) algorithm for quantile and censored quantile regression models in order to circumvent some limitations of the EM algorithm and the Gibbs sampler. Several simulation studies illustrate the performance of the algorithm: the procedure performs as well as the Gibbs sampler and outperforms the EM algorithm in the uncensored setting. Finally, we apply the methodology to the classical Engel food expenditure data and to left-censored labour supply data, finding that the SEM algorithm performs more satisfactorily than the Gibbs sampler.

3. Gibbs sampling for the mixture quantile regression model

In regression analysis, finite mixture normal linear regression models have been widely used to investigate relationships between variables arising from several unknown latent homogeneous groups. However, the normality assumption on the mixture component errors is sensitive to outliers and heavy-tailed errors. Compared with mixture normal regression, the finite mixture quantile regression model is more robust, since it models the quantiles rather than the means of the different groups, and it reveals a fuller picture of the data by fitting varying quantile functions.

In the literature, Wu and Yao (2016) first discussed a mixture quantile regression model that allows regression of the conditional quantiles on the covariates without any parametric assumption on the error densities, and developed a kernel-density-based semiparametric EM-type algorithm to estimate the model parameters. Later, Tian et al. (2016) considered the mixture quantile regression model under the assumption that the errors follow the ALD, and developed an EM algorithm from a parametric perspective. Although the EM algorithm is powerful for finding the MLE or posterior mode in models with a missing-data structure, obtaining standard errors in multi-parameter problems is not an easy task. When the sample size is large, the asymptotic theory of the MLE allows the standard errors to be estimated by the square roots of the diagonal elements of the inverse of the observed information matrix; for samples of small to moderate size, however, that asymptotic theory may not apply.

In the third chapter, motivated by Tian et al. (2016), we consider Bayesian inference for the mixture quantile regression model based on the ALD assumption. Using the hierarchical representation of the ALD and multinomially distributed mixture component indicator variables, we derive the full conditional distributions needed for Gibbs sampling under very general prior settings. The resulting Gibbs sampling procedure is transparent, and each step is easy to carry out. Compared with the EM algorithm, its advantage is that the Gibbs samples can be used to estimate the posterior distributions of the parameters and to evaluate standard errors. Simulation studies show that, under different combinations of error distributions and quantile levels, the estimates from the Gibbs procedure have relatively small bias and mean squared error. Finally, we apply the procedure to two real data sets, finding that, compared with mixture mean (normal) regression, it is more robust to outliers in the data and gives a more systematic description of the dependence of the response variable on the covariates in the different groups.

4. Robust change-point detection through Laplace linear regression using the EM algorithm

In the fourth chapter, we propose a robust regression-coefficient change-point model under the assumption that the errors follow the Laplace distribution. By representing the Laplace distribution as an appropriate scale mixture of normal distributions, we develop an expectation-maximization (EM) algorithm to estimate the position of the change point. We investigate the performance of the algorithm through various simulations, finding that the procedure is robust to the error distribution and effective at estimating the change-point position. Finally, we apply the method to the classical Holbert data and detect a change point.
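The thesis's estimator is computed by EM through the normal scale-mixture representation of the Laplace distribution. As a simpler illustration of the same idea, the sketch below profiles the equivalent least-absolute-deviations (LAD) criterion by grid search, which yields a Laplace-type change-point estimate under the assumption of a common error scale across segments; the data and names are ours, not the thesis's.

```python
import numpy as np
from scipy.optimize import minimize

def lad_cost(x, y):
    """Total absolute error of a LAD simple linear regression fit."""
    X = np.column_stack([np.ones_like(x), x])
    res = minimize(lambda b: np.abs(y - X @ b).sum(),
                   x0=np.zeros(2), method="Nelder-Mead")
    return np.abs(y - X @ res.x).sum()

def lad_change_point(x, y, min_seg=10):
    """Grid search for the split minimizing the summed LAD cost of the two segments."""
    n = len(y)
    costs = [lad_cost(x[:k], y[:k]) + lad_cost(x[k:], y[k:])
             for k in range(min_seg, n - min_seg)]
    return min_seg + int(np.argmin(costs))

# toy data: the regression slope changes at observation 60
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 100)
y = np.where(np.arange(100) < 60, 1 + 2 * x, 1 + 5 * x) + rng.laplace(scale=0.3, size=100)
print(lad_change_point(x, y))   # estimated change position near 60
```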
Keywords/Search Tags: Asymmetric Laplace Distribution, Quantile Regression, Non-iterative Sampling, Inverse Bayes Formula, Stochastic EM Algorithm, Mixture Quantile Regression, Change-point Model