Research And Application Of Propensity Score Method Based On Machine Learning Metho

Posted on:2024-08-15

Degree:Master

Type:Thesis

Country:China

Candidate:W Y Zhang

GTID:2557306920973789

Subject:Applied Statistics

Abstract/Summary:

With the development of artificial intelligence, causal inference has become one of the core problems of statistics and data science. Researchers have begun to focus on how to draw causal conclusions about a phenomenon that has already occurred. Compared with machine learning, causal inference has attracted attention for its interpretability. The application of the propensity score in observational studies has long been a focus of causal inference research, but accurate estimation of the propensity score has been hindered by problems such as the handling of covariates and the uncertain functional relationship between covariates and treatment. Some studies have proposed that introducing nonparametric machine learning methods when estimating the propensity score may be a way to estimate it accurately.

Building on standardized mortality ratio (SMR) weighting, this thesis proposes optimizing the propensity score model with machine learning. The core idea is to replace the traditional logistic regression estimate of the propensity score weights with machine learning estimates, and thereby further reduce the difference in covariate distributions between the treatment group and the control group.

The thesis first examines the performance of various propensity score models on simulated data. In four scenarios that differ in the degree of linearity and additivity of the relationship between covariates and treatment, the propensity score weights are estimated by logistic regression, support vector machine, neural network, random forest and generalized boosted models (GBM). The results show that logistic regression performs poorly under non-additive and nonlinear conditions, while ensemble learning may be more useful for achieving covariate balance.

Finally, using real data from the SEER database, the five propensity score models are analyzed empirically. Based on these models, the effect of cancer-directed surgery (CDS) on the survival rate of patients with oligometastatic pancreatic ductal adenocarcinoma (PDAC) is evaluated, and the causal inference process and results of the different models are compared. After balancing the covariates and estimating the average treatment effect in the treated group, the analysis confirms that CDS can effectively prolong the overall survival of patients with oligometastatic PDAC. The empirical analysis also shows that the neural network, random forest and GBM models achieve better covariate balance and robustness than the logistic regression model.
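
The weighting step described in the abstract can be illustrated with a minimal sketch. SMR weighting assigns weight 1 to each treated unit and e(x)/(1 - e(x)) to each control unit, where e(x) is the estimated propensity score, so that the weighted controls resemble the treated group. The Python below is illustrative only: the toy data, the propensity values and all function names are assumptions and are not taken from the thesis, and the propensity scores are supplied directly rather than fitted by any of the five models. Covariate balance is summarized here with a standardized mean difference using a pooled standard deviation, which is one common convention among several.

```python
import math


def smr_weights(treated, propensity):
    """SMR weights: 1 for treated units, e / (1 - e) for control units."""
    weights = []
    for t, e in zip(treated, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 if t == 1 else e / (1.0 - e))
    return weights


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_var(values, weights):
    """Weighted variance around the weighted mean."""
    m = weighted_mean(values, weights)
    return weighted_mean([(v - m) ** 2 for v in values], weights)


def standardized_mean_difference(x, treated, weights):
    """Weighted mean difference (treated minus control) over a pooled SD."""
    xt = [v for v, t in zip(x, treated) if t == 1]
    wt = [w for w, t in zip(weights, treated) if t == 1]
    xc = [v for v, t in zip(x, treated) if t == 0]
    wc = [w for w, t in zip(weights, treated) if t == 0]
    diff = weighted_mean(xt, wt) - weighted_mean(xc, wc)
    pooled_sd = math.sqrt((weighted_var(xt, wt) + weighted_var(xc, wc)) / 2.0)
    return diff / pooled_sd


# Hypothetical toy data: one binary covariate x, with treatment more
# likely when x == 1 (assumed propensity 0.75) than when x == 0 (0.25).
x = [1, 1, 1, 1, 0, 0, 0, 0]
treated = [1, 1, 1, 0, 1, 0, 0, 0]
propensity = [0.75 if v == 1 else 0.25 for v in x]

weights = smr_weights(treated, propensity)
smd_before = standardized_mean_difference(x, treated, [1.0] * len(x))
smd_after = standardized_mean_difference(x, treated, weights)

print(f"SMD before weighting: {smd_before:.3f}")
print(f"SMD after SMR weighting: {smd_after:.3f}")
```

In this toy example the assumed propensity scores match the treatment frequencies in the data exactly, so the weighted control group reproduces the covariate mean of the treated group. With scores estimated from real data, the residual imbalance depends on how well the propensity model captures the relationship between covariates and treatment, which is the comparison the thesis carries out across models.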

Keywords/Search Tags:

Causal inference, Propensity score, Machine learning, Standardized mortality ratio weighting

Related items

1	Causal Inference Of Multivariate Treatment Variables Based On Generalized Propensity Score
2	Causal Inference Considering Neighborhood Treatment Under Network Observation Data
3	The Effect Of Number Of Siblings And Birth Order On The Development Of Junior High School Students
4	Research On Matching Interpolation Model Inference Based On Dimension Reduction Of Propensity Score
5	Artificial Intelligence And Labor Share Of Income
6	Research On Population Mortality Based On Machine Learning Methods
7	The Causal Inference Models For The Survival Studies Under A Randomized And Broken Randomized Clinical Trials
8	Feature Weighting Method For Binary Classification In Machine Learning
9	Physical activity and osteoporotic fracture risk in older men: An application of causal inference methods with observational data
10	Comparative Study On The Prediction Of Under Five Mortality Rate In China Based On Machine Learning Model