| Causality has always been a central concern of biomedical research. How to avoid confounding bias and accurately estimate causal effects is a key problem in both observational studies and clinical trials. In observational studies, when the causal diagram is unknown and only limited knowledge is available, it is often impossible to correctly classify the variables (mediators, colliders, and confounders) that are related to both the exposure and the outcome. Suppose that mediators and confounders cannot be told apart and the data are analyzed with a logistic regression model: what happens if a mediator is mistaken for a confounder and adjusted for? In clinical trials, investigators often use surrogate endpoint results in place of clinical outcomes that require long-term follow-up, or for patients who find it difficult to undergo invasive or expensive testing. Randomization at the design stage eliminates confounding between the treatment and the outcome, thereby avoiding confounding bias. However, surrogate endpoints are often difficult to randomize. Confounders between the surrogate endpoint and the outcome will inevitably affect the estimation of causal effects and may even produce a confounding paradox. Therefore, this paper uses causal diagrams to explore causal effect estimation in these two scenarios. For the former, a logistic regression model is used to wrongly adjust for the mediator, and the resulting estimate of the causal effect of exposure on outcome is examined. For the latter, the instrumental variable method and the logistic regression model are each used to estimate the causal effect of the surrogate endpoint on the outcome, and the estimates are compared, with the aim of using the instrumental variable approach to avoid the confounding paradox. Methods: 1. A logistic regression model adjusting for the mediator was used to estimate the effect of the exposure on the outcome. The theoretical value of the causal effect was calculated using the do-calculus, and the bias of the estimate was then obtained. The exposure→mediator effect and the mediator→outcome effect were varied, and the resulting changes in bias were observed and compared. The size of the bias after adjusting for the mediator was also derived theoretically. 2. Based on the causal diagrams, the do-calculus was used to calculate the theoretical values of the causal effect of the treatment on the outcome and of the surrogate endpoint on the outcome, and the estimates of these two effects were calculated and compared. First, a logistic regression model was used to estimate the effect of the treatment on the outcome. Second, the causal effect of the surrogate endpoint on the outcome was estimated with different methods depending on the type of surrogate endpoint: for a discrete surrogate endpoint, the logistic regression model and the two-stage residual inclusion (2SRI) instrumental variable method; for a continuous surrogate endpoint, the logistic regression model and the adjusted instrumental variable (Adjusted IV) method. When comparing the estimates with the theoretical values, attention was paid to the sign of the estimated effect of the surrogate endpoint on the outcome, in order to judge whether a confounding paradox had occurred. Results: 1. Consequences of mistakenly adjusting for a mediator in causal effect estimation. (1) Whether or not confounders were present in the causal diagram, a logistic regression model that mistakenly adjusted for the mediator as if it were a confounder gave a biased estimate of the causal effect. In the absence of confounders, the causal effect was underestimated in most cases; in the presence of confounders, it was overestimated. (2) Changing the exposure→mediator effect or the mediator→outcome effect affected the estimate of the total effect. In the absence of confounders, changing the former produced a larger bias that was also more sensitive (a steeper curve). In the presence of confounders, the bias from the latter was larger, but the bias from the former was still more sensitive. (3) The theoretical derivation proved that when the logistic regression model adjusts for the mediator as if it were a confounder, the estimated effect of exposure on outcome is biased. 2. The confounding paradox and causal effect estimation. (1) The logistic regression model gave an unbiased estimate of the effect of the treatment on the outcome, and the sign of the estimate was consistent with that of the theoretical value. (2) When the effect of the confounder on the outcome was negative, the logistic regression model gave a biased estimate of the effect of the surrogate endpoint on the outcome, whether the surrogate endpoint was discrete or continuous, and the sign of the estimate was inconsistent with that of the theoretical causal effect. For a discrete surrogate endpoint, the 2SRI instrumental variable method gave an asymptotically unbiased estimate whose sign was consistent with the theoretical value, so no confounding paradox arose. For a continuous surrogate endpoint, the adjusted instrumental variable estimate was approximately equal to the theoretical value and had the same sign, so again no confounding paradox arose. Conclusions: 1. When logistic regression models are used to estimate the effect of an exposure on an outcome, confounders and mediators must be correctly identified. If a mediator is mistaken for a confounder and adjusted for, the estimated effect will be biased. 2. For both discrete and continuous surrogate endpoints, if there are unobserved confounders between the surrogate endpoint and the outcome, the logistic regression model may produce a confounding paradox. The paradox can be avoided by using the 2SRI or adjusted instrumental variable approach. |
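
The first scenario (adjusting for a mediator when no confounder is present) can be illustrated with a small exact calculation. This is only a sketch under assumptions that are not taken from the abstract: binary exposure X, mediator M, and outcome Y; the diagram X→M→Y plus X→Y; logistic structural models with hypothetical coefficients `a0`, `a1`, `b0`, `bx`, `bm`. Under these assumptions the do-calculus value reduces to ordinary conditioning on X, and a correctly specified logistic regression that also includes M would, in large samples, return `bx` rather than the total effect.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1.0 - p))


def total_effect_log_or(a0, a1, b0, bx, bm):
    """Log odds ratio of P(Y=1 | do(X=1)) versus P(Y=1 | do(X=0)).

    Hypothetical model (an assumption, not the paper's simulation design):
      logit P(M=1 | X)    = a0 + a1*X          (exposure -> mediator)
      logit P(Y=1 | X, M) = b0 + bx*X + bm*M   (direct effect, mediator -> outcome)
    There is no confounder, so do(X=x) is the same as conditioning on X=x.
    """
    def p_y_do(x):
        p_m = expit(a0 + a1 * x)  # P(M=1 | do(X=x))
        # Average the outcome probability over the mediator distribution.
        return (1.0 - p_m) * expit(b0 + bx * x) + p_m * expit(b0 + bx * x + bm)

    return logit(p_y_do(1)) - logit(p_y_do(0))


def mediator_adjustment_bias(a0, a1, b0, bx, bm):
    """Large-sample bias (log-odds scale) of the M-adjusted coefficient of X.

    Assumes the adjusted logistic model is correctly specified, so its
    X coefficient converges to bx; the target is the total effect.
    """
    return bx - total_effect_log_or(a0, a1, b0, bx, bm)


if __name__ == "__main__":
    # Vary the exposure -> mediator effect with the other values held fixed.
    for a1 in (0.5, 1.0, 2.0):
        bias = mediator_adjustment_bias(0.0, a1, -1.0, 0.5, 1.0)
        print(f"a1={a1:.1f}  bias={bias:+.3f}")
```

A negative bias here means the mediator-adjusted coefficient is smaller than the total effect. Because odds ratios are non-collapsible, part of the gap on the log-odds scale can come from averaging over M and not only from the blocked indirect path, so the numbers should be read as an illustration of the direction of the bias only.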