Bivariate Binomial Distribution For Comparing Binomial Observations In Two Correlated Groups

Posted on: 2021-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: C F Liu
GTID: 2370330611997972
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Paired count data arise in many fields, especially in biomedical research comparing the treatment effect in an experimental group and a control group, and they have attracted considerable attention from statisticians. Among the models used to fit this kind of data, the Poisson, negative binomial, and binomial distributions are the most widely adopted in the current literature. Comparing the distributional and statistical properties of the three models, researchers have found that the Poisson model cannot fit paired data with negative correlation, because the correlation coefficient cannot be negative under the bivariate Poisson model. To overcome this obstacle, many studies construct mixture models that adapt to different association structures within the pairs. In contrast, the bivariate binomial distribution allows the correlation coefficient to be positive, zero, or negative. It is therefore reasonable to fit correlated observations from two groups with a bivariate binomial model.

There are two types of bivariate binomial distribution, distinguished by the settings of the marginal distributions. The literature on the bivariate binomial distribution focuses mainly on applications in clinical trials, manufacturing processes, and time series analysis. Although the basic properties of this distribution have been studied extensively, we find that there is still much room for improvement in the methods for estimating the unknown parameters. Because the parameter estimator does not have a closed form, researchers have to apply numerical approaches to compute the probability parameters of the bivariate binomial distribution.

Many numerical methods have been developed to derive MLEs that have no analytical solution, among which the Newton-Raphson algorithm is the most widely used. The main advantage of the Newton-Raphson method is its rapid quadratic convergence near the maximum, but it can run into problems during the solution process. First, when the dimension of the parameter is very large, or the data are complicated or incomplete, computing the Hessian matrix at each iteration may become difficult and tedious. Second, the Newton-Raphson algorithm fails when the observed information matrix is singular. In addition, the algorithm may not converge if a poor initial value is chosen. Therefore, in this dissertation we resort to another class of iterative algorithms, minorization-maximization (MM) algorithms, to obtain the MLEs. MM algorithms generalize the well-known expectation-maximization (EM) algorithms and share their conceptual simplicity, ease of implementation, and stable numerical performance. Moreover, MM algorithms are a useful alternative when EM algorithms cannot be applied because no missing-data structure can be found. We also carry out a Bayesian analysis via the Metropolis-Hastings method as a supplementary approach. For interval estimation, we adopt bootstrap confidence intervals based on 200 replications, which play an important role in real data analysis for checking the significance of the derived estimates.

Without loss of generality, we mainly discuss the bivariate binomial distribution of type I, and the real data analyses are also carried out for type I. Hypothesis testing is introduced under both parametric and nonparametric assumptions to test the association between the paired observations. For regression analysis, we propose a bivariate binomial regression model for situations where the response variables occur as bivariate binomial observations and can be explained by a set of covariates. A logistic link function is used for the two marginal mean parameters. In particular, we use a single correlation parameter to represent a correlation coefficient that is homogeneous across all subjects, and we derive its MLE via an MM algorithm.

To evaluate the performance of the proposed methods, we conduct Monte Carlo simulations both without covariates and in the regression setting. The data generation relies on the fact that the bivariate binomial distribution is a representation of a quadrinomial distribution with a missing-data structure. The simulation results indicate high efficiency and stability for both the MM algorithms and the Bayesian estimation.

We present four case studies for the proposed methodologies. The first is the tooth data, which evaluate jaw symmetry in human beings. The second is the foot data, which show the association between the left and right sides. The third is air quality data from a pharmaceutical production process, a typical example in statistical process control (SPC). To explore the case with covariates, we collect recent clinical data on patients with the 2019 novel coronavirus disease and apply our regression model to illustrate the relationship between the clinical features and the underlying covariates.
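
The quadrinomial representation mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's code: it assumes the common construction in which both counts share the same n latent trials, each falling into one of four cells with probabilities p11, p10, p01, p00, and the function names and parameter values are hypothetical. Under that assumption the two counts are sums over the latent cells, and the sign of the correlation follows the sign of p11 - (p11 + p10)(p11 + p01).

```python
import random
from statistics import mean


def sample_bivariate_binomial(n, p11, p10, p01, rng):
    """Draw one (X, Y) pair by summing n latent four-category trials.

    Assumed construction (not taken from the thesis): each trial lands in
    cell (1,1), (1,0), (0,1) or (0,0); X and Y count successes per group.
    """
    p00 = 1.0 - p11 - p10 - p01
    if min(p11, p10, p01, p00) < 0:
        raise ValueError("cell probabilities must be non-negative and sum to 1")
    cells = [(1, 1), (1, 0), (0, 1), (0, 0)]
    draws = rng.choices(cells, weights=[p11, p10, p01, p00], k=n)
    x = sum(u for u, _ in draws)  # marginally Binomial(n, p11 + p10)
    y = sum(v for _, v in draws)  # marginally Binomial(n, p11 + p01)
    return x, y


def sample_correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# p11 below the product of the marginals gives negative correlation.
neg = [sample_bivariate_binomial(20, 0.05, 0.45, 0.45, rng) for _ in range(2000)]
# p11 above the product of the marginals gives positive correlation.
pos = [sample_bivariate_binomial(20, 0.40, 0.10, 0.10, rng) for _ in range(2000)]

print("negative case:", round(sample_correlation(neg), 2))
print("positive case:", round(sample_correlation(pos), 2))
```

Only X and Y are returned, so the four cell counts stay unobserved. This is the sense in which the pair can be viewed as a quadrinomial with missing data, and it is why the model can reach negative correlations that a bivariate Poisson model cannot.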
Keywords/Search Tags:Bayesian method, bivariate binomial distribution, correlated paired count data, MM algorithm, regression analysis