Analysis of multivariate data with random cluster size

Posted on:2012-10-19

Degree:Ph.D

Type:Dissertation

University:The Florida State University

Candidate:Li, Xiaoyun

GTID:1450390008991113

Subject:Statistics

Abstract/Summary:

In this dissertation, we examine correlated binary data in which present/absent components or missing data are related to the binary responses of interest. Depending on the data structure, correlated binary data are referred to as clustered data if the sampling unit is a cluster of subjects, or as longitudinal data if they involve repeated measurements of the same subject over time. We propose novel models for these two data structures and illustrate them with real data applications.

In biomedical studies involving clustered binary responses, the cluster size can vary because some components of the cluster can be absent. When both the presence of a cluster component and the binary disease status of a present component are treated as responses of interest, we propose a novel two-stage random effects logistic regression framework. For ease of interpretation of the regression effects, both the marginal probability of presence/absence of a component and the conditional probability of disease status of a present component preserve approximate logistic regression forms. We present a maximum likelihood method of estimation that can be implemented using standard statistical software. We compare our models and the physical interpretation of regression effects with competing methods from the literature. We also present a simulation study to assess the robustness of our procedure to misspecification of the random effects distribution and to compare the finite-sample performance of the estimates with existing methods. The methodology is illustrated by analyzing a study of periodontal health status in a diabetic Gullah population.

We extend this model to longitudinal studies with a binary longitudinal response and informative missing data. In longitudinal studies, when each subject is treated as a cluster, the cluster size is the total number of observations for that subject. When data are informatively missing, the cluster size of each subject can vary and is related to the binary response of interest, and the missingness mechanism is also of interest. This is a modification of the clustered binary data setting with present components. We modify and adapt our proposed two-stage random effects logistic regression model so that both the marginal probabilities of the binary response and the missing indicator and the conditional probabilities of the binary response and the missing indicator preserve logistic regression forms. We present a Bayesian framework for this model and illustrate the proposed model on an AIDS data example.
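The two-stage structure described in the abstract can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not the dissertation's actual model: the function names (`expit`, `simulate_cluster`), the single covariate `x`, the coefficient pairs `alpha` and `beta`, and the independent normal random intercepts `u` and `v` are all hypothetical choices made for illustration. The abstract does not specify the random effects distribution or the covariates used.

```python
import math
import random


def expit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def simulate_cluster(x, n_components, alpha, beta, sigma_u, sigma_v, rng):
    """Simulate one cluster with a random number of present components.

    Stage 1: each component is present with a logistic probability that
             depends on covariate x and a cluster-level random intercept u.
    Stage 2: each present component is diseased with a logistic probability
             that depends on x and a cluster-level random intercept v.

    ASSUMPTION: u and v are independent normal random effects; this is an
    illustrative choice, not taken from the source.
    """
    u = rng.gauss(0.0, sigma_u)  # random effect for presence (hypothetical)
    v = rng.gauss(0.0, sigma_v)  # random effect for disease (hypothetical)
    records = []
    for _ in range(n_components):
        present = rng.random() < expit(alpha[0] + alpha[1] * x + u)
        if present:
            diseased = rng.random() < expit(beta[0] + beta[1] * x + v)
        else:
            diseased = None  # disease status is undefined for absent components
        records.append((present, diseased))
    return records


if __name__ == "__main__":
    rng = random.Random(2012)
    cluster = simulate_cluster(
        x=1.0, n_components=28,
        alpha=(1.0, -0.5), beta=(-1.0, 0.8),
        sigma_u=1.0, sigma_v=1.0, rng=rng,
    )
    cluster_size = sum(1 for present, _ in cluster if present)
    n_diseased = sum(1 for _, d in cluster if d)
    print("observed cluster size:", cluster_size)
    print("diseased among present:", n_diseased)
```

In this sketch the observed cluster size is itself random, because it is the count of components that pass stage 1, and the disease response exists only for those components. Under the longitudinal extension, the same skeleton could be read with "present" meaning "observed at this visit", again as an illustrative analogy only.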

Keywords/Search Tags:

Data, Cluster, Present, Binary, Missing, Regression, Random, Model

Related items

1	Estimation And Variable Selection Of Regression Models With Missing Data
2	Asymptotic Properties Of Functional Nonparametric And Semi-parametric Regression Model Estimation With Responses Missing At Random
3	EM Algorithm For Binary Markov Chains Of Longitudinal Data With Missing Data
4	Empirical Likelihood Inferences For Two Classes Of Statistical Models With Missing Responses
5	Statistical Inference For Clustered Data Under Missing-at-Random Mechanism
6	Missing Data, Linear Regression Statistical Inference
7	Expectation Estimator In Missing Data
8	Generalized Functional Regression Model And Missing Data Model
9	Research For The Saturated Model On Exchangeable Missing Binary Data
10	Variable Selection And Parameter Estimation For Heterogeneous Regression Model With Missing Data