
A Comparative Study Of Different Propensity Scoring Methods Based On Simulation And Empirical Methods

Posted on: 2021-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: X D Hao
Full Text: PDF
GTID: 2514306353469804
Subject: Social Medicine and Health Management
Abstract/Summary:
Objective: To compare the operational difficulty of different propensity score methods, and to compare their ability to balance confounding factors between two groups and to estimate the average treatment effect in different situations, so as to provide suggestions for the choice and application of propensity score methods.

Methods:
(1) Literature research method: Three Chinese databases (Wanfang, CNKI, and Weipu) and the English-language PubMed database were searched for literature on the propensity score, and related textbooks and treatises were consulted. The reference lists of the retrieved literature were used to trace further relevant documents. This review served to understand the current state and development of propensity score research, to determine the research topics, and to study the concepts, principles, and implementation of the propensity score method and the Monte Carlo method, laying the foundation for the simulation and empirical studies.
(2) Simulation research method: The Monte Carlo method was used for the simulation study. Data sets were simulated according to the differing characteristics of the ideal situation and the actual situation, and the propensity score methods were compared on these data sets.
(3) Empirical research method: An empirical study was conducted using SEER database records of patients diagnosed with early esophageal cancer and treated with endoscopic therapy or esophagectomy, to compare how well the different propensity score methods balance confounding factors between the two groups and to verify the simulation results to a certain extent.
(4) Data processing methods: The Shapiro-Wilk test was used to test normality; the chi-square test, t test, rank-sum test, linear regression, and Logistic regression were used for balance tests; and meta-analysis was used to combine the stratified results. Data collation, chart making, and treatment effect calculations were carried out with Microsoft Excel 2013, SAS 9.4, and R 3.6.1.

Results:
(1) Theoretical research results: Both Logistic regression and GBM models can be fitted with statistical software such as R, SAS, and Stata. However, the GBM model places higher demands on software operation and on understanding and setting parameters, so it is more difficult to operate than Logistic regression. PSR is the simplest and easiest to implement of the four methods; PSW places few demands on statistical software and is also easy to implement; PSS requires specific statistical methods or statistical software and is more complicated to implement than PSW and PSA; PSM requires statistical software with dedicated matching functions and is the most complicated of the four methods.
(2) Simulation research results: Six data sets were constructed, each with seven covariates (four measurement variables and three count variables), one grouping variable, and one outcome variable. Three data sets each were generated for the ideal situation and the actual situation, with sample sizes of 500, 1000, and 2000; the true treatment effects in the two situations are 0.19204 and 0.19728, respectively. For accuracy, the results across the three sample sizes were as follows. When model one does not omit important confounding factors, the average treatment effect estimated by LR-PSW is closest to the true effect and that estimated by GBM-PSW is farthest from it. When model one omits important confounding factors, the estimate from GBM-PSW is closest to the true effect and that from GBM-PSS is farthest from it. When model two does not omit important confounding factors, the estimate from GBM-PSW is closest to the true effect and that from LR-PSW is farthest from it. When model two omits important confounding factors, the estimate from GBM-PSW is closest to the true effect and that from GBM-PSS is farthest from it. For precision, the mean squared error of the models is below 0.005 in most cases and below 0.01 in others, so the differences are small.
(3) Empirical research results: In terms of balancing confounding factors, whether a Logistic regression model or a GBM model is fitted, PSM balances all confounding factors and PSS leaves two confounding factors unbalanced; PSW leaves three confounding factors unbalanced when a Logistic regression model is fitted and one when a GBM model is fitted. For the estimated effect, the 95% CI of the OR calculated by GBM-PSW is the narrowest (width 0.185), and that calculated by GBM-PSM is the widest (width 0.497).

Conclusion: Different propensity score methods differ in how accurately they estimate the average treatment effect in different situations. In actual research there is no universally optimal model, only the most suitable model for each case, and an appropriate method should be selected according to the characteristics of the data and the requirements of the analysis. The comparison of propensity score methods from theoretical, simulation, and empirical perspectives provides a reference for users of these methods and offers a preliminary answer to the problem of choosing a propensity score method in epidemiological research.
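The simulation workflow described in the abstract (generate data, estimate propensity scores, weight, then summarize accuracy and precision over repeated draws) can be illustrated with a minimal sketch. This is not the thesis's code or design: the data-generating process, the two confounders, the continuous outcome, the true effect of 0.2, and all function names (`simulate`, `fit_logistic`, `ipw_ate`, `monte_carlo`) are assumptions chosen for illustration. It shows only a Logistic regression propensity model combined with normalized inverse-probability weighting, roughly in the spirit of LR-PSW, and uses NumPy alone.

```python
import numpy as np


def simulate(n, rng, true_effect=0.2):
    """Hypothetical data-generating process (an assumption, not the thesis design):
    two confounders that raise both treatment probability and the outcome."""
    x = rng.normal(size=(n, 2))
    logit = 0.5 * x[:, 0] + 0.5 * x[:, 1]
    t = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))
    y = true_effect * t + x[:, 0] + x[:, 1] + rng.normal(size=n)
    return x, t, y


def fit_logistic(x, t, n_iter=25):
    """Fit Logistic regression by Newton-Raphson and return propensity scores."""
    design = np.column_stack([np.ones(len(t)), x])
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        grad = design.T @ (t - p)
        hess = design.T @ (design * (p * (1.0 - p))[:, None])
        beta += np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-design @ beta))


def ipw_ate(t, y, ps):
    """Normalized inverse-probability-weighted estimate of the average treatment effect."""
    w1 = t / ps
    w0 = (1 - t) / (1 - ps)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)


def monte_carlo(n, reps=200, true_effect=0.2, seed=0):
    """Repeat simulate -> fit -> weight, then report bias and mean squared error."""
    rng = np.random.default_rng(seed)
    ipw = np.empty(reps)
    naive = np.empty(reps)
    for r in range(reps):
        x, t, y = simulate(n, rng, true_effect)
        ps = fit_logistic(x, t)
        ipw[r] = ipw_ate(t, y, ps)
        # Unadjusted difference in means, kept for comparison.
        naive[r] = y[t == 1].mean() - y[t == 0].mean()
    return {
        "ipw_bias": float(ipw.mean() - true_effect),
        "ipw_mse": float(np.mean((ipw - true_effect) ** 2)),
        "naive_bias": float(naive.mean() - true_effect),
    }


if __name__ == "__main__":
    for n in (500, 1000, 2000):
        print(n, monte_carlo(n))
```

Under these assumed settings the unadjusted difference in means is biased upward by confounding, while the weighted estimate centers near the assumed true effect. Comparing a GBM-based propensity model, matching, or stratification would require swapping in the corresponding estimator at the `fit_logistic` or `ipw_ate` step.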
Keywords/Search Tags:Monte Carlo, Simulation Research, Propensity Score, Empirical Research