Semiparametric Models Under Biased Sampling Data

Posted on: 2016-10-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H J Ma
GTID: 1220330470957954
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Survival analysis has become one of the major areas of biostatistics, and it also has important applications in other fields such as reliability theory, actuarial science, demography, epidemiology, social science and economics. Because sampling is often complex, most real data are biased; the common censored and truncated data can both be treated as general biased data. Biased sampling data also arise in many other areas, such as biomedicine, social science, economics and quality control. When the probability that an individual is sampled depends on the individual's own values, in other words, when different individuals have different sampling probabilities, the resulting data are biased. This is an interesting sampling problem because the scheme favors some subjects and overlooks others. When the observed data are biased, statistical procedures designed for simple random samples are no longer valid, and new methods are needed for this type of data. In this dissertation, we use the estimating equation method to study general biased data under semiparametric models, which combine finite-dimensional parameters that are easy to interpret with an infinite-dimensional function that adds flexibility.

In Chapter 1, we first introduce the types of biased sampling data that we study, including censored data, length-biased data and data obtained under the case-cohort design. We then present several semiparametric models that are commonly used in survival analysis: the Cox model, the additive hazards model, the semiparametric transformation model, the quantile regression model and the proportional mean residual life model.

In Chapter 2, we exploit an important property of length-biased data, namely that the truncation time and the residual time after enrollment have the same distribution (Huang & Qin, 2011, 2012), to construct composite estimators under the additive risk model.
The resulting estimators are almost twice as efficient as the left-truncated and right-censored estimators. We and Cheng & Huang (2014) were, at almost the same time, the first to use composite estimating equations. We also establish the large-sample properties of the proposed estimators and examine their finite-sample performance; moreover, we apply our methods to the Channing House data and find that they perform well.

In Chapter 3, we use the martingale structure of left-truncated right-censored data and the property of length-biased data mentioned in Chapter 2 to propose a simple estimating equation and a composite estimating equation under the quantile regression model for length-biased right-censored data. Our methods do not require estimating the distribution of the censoring time and are therefore less complex than those of Chen & Zhou (2012) and Wang & Wang (2014). We establish uniform consistency and weak convergence of the resulting estimators using empirical process and stochastic integral techniques. As in Peng & Huang (2008), the proposed estimating equations lead to a simple algorithm that involves minimizing L1-type convex functions, so the new estimation procedure can be implemented easily by adapting existing functions in R. The covariance matrix of the limiting process involves unknown density functions, and estimating these quantities may be unstable with samples of small or moderate size. To estimate the variance of the proposed estimators, we therefore develop resampling methods that extend the technique of Jin et al. (2001). Finally, we apply the proposed methods to the Channing House data.

In Chapter 4, we study the proportional mean residual life model for censored data under the case-cohort design. This work is motivated by a study of nickel refiners in South Wales, in which the refiners are interested in knowing how much longer they can expect to survive given their current situation.
Moreover, the event rate in this study is quite low, so the case-cohort design is preferred. We propose weighted estimating equations for simultaneous estimation of the regression parameters and the baseline mean residual life function. We then conduct simulation studies to examine the finite-sample properties of the regression parameter estimators. Finally, the South Wales nickel refiners data are used to illustrate the proposed estimating procedures.

In Chapter 5, we study the Cox model for length-biased and right-censored data under the case-cohort design. Motivated by the pseudo likelihood developed by Self & Prentice (1988) and the composite partial likelihood considered by Huang & Qin (2012), we propose a simple composite pseudo partial likelihood method. We derive the large-sample properties of the case-cohort maximum composite pseudo likelihood estimator and of the corresponding cumulative hazard function estimator, using a combination of empirical process theory and finite-population convergence results. We also report simulation results and use the Oscar Awards data to illustrate the proposed estimating procedure.

In Chapter 6, we discuss the semiparametric transformation model for length-biased and right-censored data under the case-cohort design. Using the martingale integral representation and inverse probability weighting, Lu & Tsiatis (2006) studied the semiparametric transformation model for right-censored data under the case-cohort design. Although the martingale integral representation can also handle left truncation, the resulting estimators under length-biased sampling are not fully efficient. We therefore combine the property of length-biased sampling mentioned in Chapter 2 with inverse probability weighting to construct composite estimating equations.
The proposed estimating equations can be solved by a simple iterative method to obtain the regression parameters and the unknown transformation function. We also give the asymptotic distributions of the proposed estimators, together with their proofs. Simulation studies and a real data analysis are conducted to examine the finite-sample performance of the proposed estimators.
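
The length-biased property used in Chapters 2, 3 and 6 (the truncation time and the residual time after enrollment share the same distribution) can be checked numerically. The following is only an illustrative sketch and is not taken from the dissertation: the Gamma(2, 1) failure-time distribution, the uniform onset-to-sampling time on (0, tau), the value tau = 40 and the function name simulate_length_biased are all assumptions chosen for the demonstration.

```python
import numpy as np


def simulate_length_biased(n_draws=400_000, tau=40.0, seed=0):
    """Simulate length-biased sampling by rejection (illustrative assumptions).

    Population failure times T are drawn from Gamma(shape=2, scale=1), an
    arbitrary choice. The time A from onset to the sampling date is drawn
    uniformly on (0, tau), which mimics a stationary onset process as long
    as tau exceeds essentially all failure times. A subject is observed only
    if it is still event-free at the sampling date, i.e. T > A.
    """
    rng = np.random.default_rng(seed)
    t = rng.gamma(shape=2.0, scale=1.0, size=n_draws)  # population failure times
    a = rng.uniform(0.0, tau, size=n_draws)            # onset-to-sampling times
    keep = t > a                                       # left truncation: T must exceed A
    t_obs = t[keep]                                    # length-biased failure times
    a_obs = a[keep]                                    # truncation times
    v_obs = t_obs - a_obs                              # residual times after enrollment
    return t_obs, a_obs, v_obs


t_obs, a_obs, v_obs = simulate_length_biased()

# Compare the empirical distributions of truncation and residual times.
probs = [0.25, 0.50, 0.75]
print("observed sample size:", t_obs.size)
print("mean of observed T  :", t_obs.mean())
print("mean of A, mean of V:", a_obs.mean(), v_obs.mean())
print("quartiles of A      :", np.quantile(a_obs, probs))
print("quartiles of V      :", np.quantile(v_obs, probs))
```

Under these assumptions the observed failure times follow the length-biased density t f(t) / E[T], so their mean should be close to E[T^2] / E[T] = 3 rather than the population mean 2, and the means and quartiles of A and V should agree up to simulation error. A composite estimating equation uses this symmetry by treating the truncation time as a second, exchangeable copy of the residual time.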
Keywords/Search Tags: Left-truncated, Right-censored, Length-biased sampling, Case-cohort design, Additive risk model, Quantile regression model, Proportional mean residual life model, Cox model, Semiparametric transformation model, Empirical process