
A Simulation Study Of Logistic Regression And Rare Events Logistic Regression Model

Posted on: 2006-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Yang
Full Text: PDF
GTID: 2144360155973563
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objectives To explore how the stability of the logistic regression model and the rare events logistic regression model changes with the number of events per variable (EPV), so as to establish a basis for the appropriate application of these two models and to provide a reference for similar studies of other statistical models.

Methods Monte Carlo methods were applied to examine the stability of logistic regression and rare events logistic regression, based on data from two cross-sectional studies. The simulations were carried out with a program written in FoxPro. EPV was varied over the values 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 25 and 30. At each EPV value, 500 samples were drawn for each regression model, and logistic regression and rare events logistic regression were fitted to each sample. Model stability was evaluated using the frequency distribution of the coefficients, bias, precision, empirical coverage probability and power.

Results The Monte Carlo results based on the two cross-sectional data sets were similar. EPV affects the validity of the model. When EPV is below 12 for logistic regression, or below 10 for rare events logistic regression, the estimated coefficients are biased in either the positive or the negative direction, and the smaller the EPV, the larger the bias. The variability of the coefficients increases as EPV decreases, the 95% confidence intervals of the estimates do not have proper coverage, and the power of the regression model decreases.

Conclusions Multivariable analyses such as logistic regression and rare events logistic regression cannot assure the validity of the results when too few outcome events are available relative to the number of independent variables in the model (for example, EPV<10). In other words, attention must be paid to the problem of overfitting.
Keywords/Search Tags:Monte Carlo, Logistic regression, Rare events logistic regression, Coronary Heart Disease (CHD), Risk factors
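
The Monte Carlo design summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis program (which was written in FoxPro and resampled from two real cross-sectional data sets). Instead it assumes synthetic data: three standard normal predictors, hypothetical true coefficients (0.5, -0.5, 0.0) and an intercept of -2. EPV is taken to mean the expected number of outcome events divided by the number of predictors, so the sample size at each EPV is chosen to give roughly EPV x 3 events. Only ordinary logistic regression is fitted (by Newton-Raphson); the rare events correction is not implemented here. The function names `fit_logistic` and `simulate_epv` are illustrative.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25, tol=1e-8):
    """Maximum likelihood logistic regression via Newton-Raphson.

    Returns (coefficients, standard errors) including the intercept,
    or None if the fit does not converge (e.g. under separation).
    """
    Xd = np.column_stack([np.ones(len(y)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Xd @ beta, -30, 30)
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        hess = Xd.T @ (Xd * w[:, None])   # observed information matrix
        grad = Xd.T @ (y - p)             # score vector
        try:
            step = np.linalg.solve(hess, grad)
            cov = np.linalg.inv(hess)
        except np.linalg.LinAlgError:
            return None
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            return beta, np.sqrt(np.diag(cov))
    return None


def simulate_epv(epv, n_reps=200, true_beta=(0.5, -0.5, 0.0),
                 intercept=-2.0, seed=0):
    """Simulate bias, spread and 95% CI coverage at one EPV value.

    All data-generating settings are assumptions made for illustration.
    """
    rng = np.random.default_rng(seed)
    true_beta = np.asarray(true_beta, dtype=float)
    k = len(true_beta)

    # Estimate the marginal event rate, then size the sample so that
    # the expected number of events is about epv * k.
    big = rng.standard_normal((100_000, k))
    rate = np.mean(1.0 / (1.0 + np.exp(-(intercept + big @ true_beta))))
    n = int(np.ceil(epv * k / rate))

    estimates, covered = [], []
    for _ in range(n_reps):
        X = rng.standard_normal((n, k))
        p = 1.0 / (1.0 + np.exp(-(intercept + X @ true_beta)))
        y = (rng.random(n) < p).astype(float)
        fit = fit_logistic(X, y)
        if fit is None:
            continue  # skip non-converged replicates
        b, se = fit
        estimates.append(b[1:])
        covered.append(np.abs(b[1:] - true_beta) <= 1.96 * se[1:])

    estimates = np.array(estimates)
    covered = np.array(covered)
    return {
        "n": n,
        "converged": len(estimates),
        "bias": estimates.mean(axis=0) - true_beta,
        "sd": estimates.std(axis=0, ddof=1),
        "coverage": covered.mean(axis=0),
    }


if __name__ == "__main__":
    for epv in (2, 10, 30):
        r = simulate_epv(epv)
        print(epv, r["n"], r["converged"], r["bias"], r["sd"], r["coverage"])
```

Running the loop over the full EPV grid from the abstract (2 through 30) and comparing bias, standard deviation and coverage across values gives the kind of summary the thesis reports, though the numbers from this synthetic setup should not be expected to match the thesis results.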