The Study Of Bias And Its Improvement Strategy Of Logistic Regression Model In Epidemiological Etiological Analysis

Posted on: 2019-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Yu
Full Text: PDF
GTID: 2394330542998122
Subject: Public Health
Abstract/Summary:
Background: Identifying the risk factors of disease and then inferring its causes is an enduring theme of epidemiological research and a main task of preventive medicine practice. However, the most commonly used statistical methods are based on association, so causality is inferred only approximately from correlation. The influence of confounding factors is often ignored in the analysis, which leads to erroneous estimates of causal effects. Epidemiologists have proposed a series of statistical methods to control confounding bias, including restriction, stratification, and adjustment, but how to apply these traditional methods correctly and control confounding bias effectively within conventional association analysis remains a persistent difficulty. The logistic regression model is the most commonly used method for analyzing risk factors in epidemiological studies. In fact, its estimates are often biased, because it estimates a conditional probability at the level of association analysis. How to use the logistic regression model correctly to estimate the actual causal effect of exposure on outcome is therefore an important issue to be resolved in etiological analysis. In addition, the causal effect estimate differs when the adjustment set contains different variables, and as the number of adjustment variables increases, the precision of the estimate is also affected. How to choose the optimal adjustment set to estimate the causal effect of exposure on outcome accurately is thus another key issue in logistic regression analysis.

Methods: In this study, we used statistical simulation, theoretical derivation, and real data analysis to systematically study the bias of the logistic regression model and to explore the optimal adjustment strategy and its inclusion criteria. To address the biased estimates of the logistic regression model, we used inverse probability weighting to construct a logistic marginal structural model that corrects the biased estimates of the traditional logistic regression model and yields unbiased estimates of the causal effect. To select the optimal confounder adjustment set, we drew on confounding equivalence and derived an optimal strategy and general guiding principles by comparing the performance of the logistic marginal structural model with that of the traditional logistic regression model. To account for the complexity of the relationships among confounding variables, we constructed four causal diagram models, from simple to complex, and obtained the confounding equivalence sets according to the necessary and sufficient conditions of confounding equivalence. The two logistic models were then used to adjust for different confounding sets, and the accuracy and precision of the causal effect estimates were evaluated by bias and standard error, respectively. In the real data analysis, where many confounding factors are present, it is often difficult to obtain a definitive causal graph and to clarify the true causal effect, so the effect of triglycerides on prediabetes was estimated in a step-by-step manner. Comparing the performance of the logistic regression model and the logistic marginal structural model further illustrates the differences between the two models in etiological analysis.

Results: 1. Theoretical proof and simulation studies gave the following results. (1) When the adjusted confounding set met the backdoor criterion, the causal effect estimate of the logistic regression model was biased in most cases. Adjusting for the set of all confounders gave the same causal effect estimate as adjusting for the set of parent nodes of the outcome. Adjusting for all parent nodes of the exposure gave a different estimate, but one with smaller bias (higher accuracy), and in most cases the highest accuracy. (2) The logistic marginal structural model obtained unbiased estimates of the causal effect when adjusting for any set that satisfied the backdoor criterion. Among these sets, the standard error was smallest (the highest precision) when adjusting for all parent nodes of the outcome. (3) When the structure of the causal diagram was only partially known, the causal effect estimates of the logistic regression model after adjusting for sets that satisfied the Markov boundary condition were biased and not equal to one another. Adjusting for the parent nodes of the exposure produced less bias. (4) After adjusting for sets that satisfied the Markov boundary equality, the causal effect estimates of the logistic marginal structural model were biased but approximately equal. 2. In the real data analysis, we estimated the causal effect of high triglycerides on prediabetes using traditional logistic regression models and logistic marginal structural models. High triglycerides were a risk factor for prediabetes in both models. As both kinds of model adjusted for more biochemical indicators and fitness measures, the estimated effect of high triglycerides on prediabetes gradually decreased. When the same confounding factors were adjusted for, the traditional logistic regression model gave a larger estimated effect of high triglycerides on prediabetes than the logistic marginal structural model.

Conclusions: 1. Taking into account the relationships among confounding variables, simulation studies and theoretical derivations were carried out for four causal diagram models, with the following conclusions. (1) When the adjusted set met the backdoor criterion, the causal effect estimate of the traditional logistic regression model was biased in most cases, whereas the logistic marginal structural model gave an approximately unbiased estimate with higher accuracy. The logistic marginal structural model could therefore be used instead of the logistic regression model for etiological analysis. (2) When the adjusted confounding set satisfied the Markov boundary equality, the causal effect estimates of both logistic models were biased, but the logistic marginal structural model was relatively stable, so it is still the recommended model. (3) The criteria for choosing the optimal adjustment strategy were: with the logistic marginal structural model, adjust for all parent nodes of the outcome; with the traditional logistic regression model, adjust for all parent nodes of the exposure. 2. The real data analysis was consistent with the simulation results. Compared with the logistic marginal structural model, the traditional logistic regression model gave a higher estimate of the causal effect of the exposure on the outcome.
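
The contrast between the two models described above can be illustrated with a minimal simulation sketch. It is not the thesis's own simulation design: the single-confounder causal diagram, the coefficient values, and the names `fit_logistic` and `compare_estimators` are all assumptions chosen for illustration. The sketch fits a conventional logistic regression adjusted for the confounder, then an inverse probability weighted logistic marginal structural model, and compares both with the marginal log odds ratio implied by the simulated data. One well-known reason the two estimates can differ is that the odds ratio is non-collapsible, so a conditional log odds ratio need not equal the marginal one even when confounding is fully controlled.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, weights=None, n_iter=30):
    """(Weighted) logistic regression fitted by Newton-Raphson.

    Illustrative helper, not a library function. Returns the coefficients.
    """
    n, p = X.shape
    w = np.ones(n) if weights is None else weights
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = expit(X @ beta)
        grad = X.T @ (w * (y - mu))
        hess = (X * (w * mu * (1.0 - mu))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def compare_estimators(n=200_000, seed=0):
    """Simulate C -> A, C -> Y, A -> Y (assumed toy diagram) and compare models."""
    rng = np.random.default_rng(seed)
    c = rng.normal(size=n)                                  # confounder
    a = rng.binomial(1, expit(0.8 * c))                     # exposure
    y = rng.binomial(1, expit(-1.0 + 1.0 * a + 1.5 * c))    # outcome
    ones = np.ones(n)

    # Conventional logistic regression adjusted for C:
    # the coefficient of A is a conditional log odds ratio.
    beta_cond = fit_logistic(np.column_stack([ones, a, c]), y)

    # Propensity score model for the exposure, then unstabilized IPW weights.
    gamma = fit_logistic(np.column_stack([ones, c]), a)
    ps = expit(gamma[0] + gamma[1] * c)
    w = np.where(a == 1, 1.0 / ps, 1.0 / (1.0 - ps))

    # Logistic marginal structural model: weighted regression of Y on A only.
    beta_msm = fit_logistic(np.column_stack([ones, a]), y, weights=w)

    # Marginal log odds ratio implied by the data-generating model,
    # obtained by averaging the potential-outcome risks over C.
    p1 = expit(-1.0 + 1.0 + 1.5 * c).mean()
    p0 = expit(-1.0 + 1.5 * c).mean()
    true_marginal = np.log(p1 / (1.0 - p1)) - np.log(p0 / (1.0 - p0))

    return {
        "conditional": beta_cond[1],
        "ipw_msm": beta_msm[1],
        "true_marginal": true_marginal,
    }


if __name__ == "__main__":
    for name, value in compare_estimators().items():
        print(f"{name}: {value:.3f}")
```

Under these assumed parameters the adjusted logistic regression recovers the conditional coefficient used to generate the data, while the weighted model targets the smaller marginal log odds ratio. This is in the same direction as the abstract's finding that the traditional model gives the larger estimate, although the toy diagram here is far simpler than the four causal diagram models studied in the thesis.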
Keywords/Search Tags:Logistic regression model, Inverse probability weighting based marginal structural model, Simulation study, Causal diagrams, Confounding equivalence