
Quantitative Evaluation Of Risk Factors And Construction Of Risk Prediction Model For Gastric Cancer

Posted on: 2021-03-10    Degree: Doctor    Type: Dissertation
Country: China    Candidate: F J Duan    Full Text: PDF
GTID: 1364330602970821    Subject: Epidemiology and Health Statistics
Abstract/Summary:
Gastric cancer (GC) is the most common malignant tumor of the digestive system and ranks second in cancer mortality in China. A good prognosis can be obtained if radical resection is performed at an early stage of GC, with a 5-year survival of up to 90%. However, fewer than 10% of patients are detected at an early stage, and the 5-year survival for those with advanced disease is below 30%. More effective, non-invasive screening methods for early GC are therefore urgently needed. Multiple elements, including Helicobacter pylori (H. pylori) infection, environment, and genetics, are implicated in GC tumorigenesis and progression. In addition to currently identified risk factors (H. pylori infection, smoking, drinking, and a high-salt diet), the combined roles of single nucleotide polymorphisms (SNPs) and long noncoding RNAs (lncRNAs) are increasingly being revealed. In view of the intrinsic complexity and heterogeneity of GC, identifying potential risk factors and predicting the contribution of correlated factors with appropriate models has been widely used for early identification of high-risk populations, precise prevention, and individualized intervention. Nonetheless, lncRNAs have not been included as a risk factor in currently established risk prediction models, and no polygenic risk score (PRS)-based models have been established for GC.

Objective: To clarify the impact of non-genetic factors, such as H. pylori infection and environment, and of genetic factors on GC incidence in the Chinese population, as well as their epidemiological significance, through quantitative systematic evaluation and meta-analysis; and to construct GC risk prediction models based on the results of population association verification, select the optimal model by evaluating predictive efficacy, and ultimately provide evidence for early diagnosis and precise prevention of GC in the Chinese population.

Methods:
1. Epidemiological evaluation of genetic and non-genetic factors and GC risk. (1) A systematic literature search was implemented using
PubMed, EMBASE, Cochrane Library, Web of Science, CNKI (Chinese), Wanfang (Chinese), VIP (Chinese), and CBM (Chinese) databases. Quantitative pooled analysis was performed on studies that explored the correlation between biological, behavioral, environmental, and genetic susceptibility factors and GC incidence in the Chinese population, and the Venice criteria were used to evaluate the accumulated evidence. (2) The odds ratio (OR) and 95% confidence interval (95% CI) were used to analyze the correlation between non-genetic and genetic factors and gastric cancer. False positive report probability (FPRP) was used to evaluate the significant results, and the contribution to the risk of gastric cancer was evaluated by correlating significant non-genetic factors and gene combinations. (3) Genetic scores, attributable risk percentage (ARP), and population attributable risk percentage (PARP) were used to evaluate epidemiological effects.
2. Construction of a gastric cancer risk prediction model based on PRS. (1) Bioinformatics methods were used to screen lncRNAs, and their corresponding functional SNPs, that are differentially expressed in GC and possess potential binding sites for microRNAs (miRNAs). An evidence-based medicine (EBM) strategy was used to screen the SNPs with genetic associations. The extracted SNPs were subjected to quality control together with potentially significant sites in the Chinese population reported by published systematic reviews, and were then verified in the population. (2) In a frequency-matched (1:1) case-control design, with subjects matched by gender and age (±2 years), blood samples were collected from 660 pathologically confirmed GC patients and 660 normal controls from the community. Polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP), created restriction site PCR-RFLP (CRS-PCR-RFLP), and improved multiplex ligation detection reaction (iMLDR™) were used to genotype the SNPs corresponding to lncRNAs or selected by EBM. (3) The Hardy-Weinberg equilibrium (HWE) test
was performed on the genotype distribution of the controls using a chi-square goodness-of-fit test. Unconditional logistic regression was used to analyze the association between the selected SNPs and GC risk. (4) PLINK was used for quality control of the related SNPs, allelic association analysis, and generation of the base and target datasets for PRSice-2 (polygenic risk score software). GC risk prediction models were constructed from the EBM-screened and association-verified SNPs, based on weighted genetic risk scores (wGRS) and PRS. lncRNA SNPs were added to the prediction models as an independent dataset of risk factors, and an empirical P-value procedure with 10,000 fittings within the model was used to optimize model parameters and build the optimal model. (5) Receiver operating characteristic (ROC) curves and the area under the curve (AUC) were used to evaluate how well the different models discriminated GC. Net reclassification improvement (NRI) and integrated discrimination improvement (IDI) were used to evaluate the predictive ability of the wGRS and PRS models; the Akaike information criterion (AIC) and Bayesian information criterion (BIC) were used to evaluate model fit.

Results:
1. H. pylori infection, smoking, drinking, family history, stomach disorders, high-salt diets, pickled foods, fast eating, irregular diets, hot foods, smoked and fried foods, spicy diets, depression, and diabetes were associated with GC risk (P<0.01). The trend in the H. pylori infection rate was broadly consistent with GC incidence.
2. PSCA rs2976392 and rs2294008, MUC1 rs4072037, MTHFR rs1801133, COX-2 rs20417, XRCC1 rs1799782 and rs25487, XRCC3 rs861539, NAT2 rs1799930 and rs1799929, NAT2 phenotype (slow/fast), PLCE1 rs2274223 and rs3765524, GSTM1, GSTT1, IL-17A rs2275913 and rs8193036, PRKAA1 rs13361707, ERCC5 rs751402, TGFBR2 rs3773651, IL-10 rs1800896, and VDR rs731236 were potential genetic risk factors for GC (P<0.05).
3. The cumulative frequency of the combined distribution risk of genetic and non-genetic factors in gastric cancer was highly correlated with the
combined OR and genetic score, respectively. Both the OR and the genetic score corresponding to the cumulative frequency conformed to a normal distribution after logarithmic transformation. For non-genetic factors, the top three ARPs were 66.33% (stomach disorders), 54.34% (marinated foods), and 49.75% (smoked and fried foods). The top three PARPs were 33.85% (marinated foods), 24.73% (hot foods), and 23.30% (H. pylori infection). For GC-associated SNPs, the top three ARPs were 53.91% (NAT2 rs1799929), 53.05% (NAT2 phenotype), and 42.85% (IL-10 rs1800896). The top three PARPs were 36.96% (VDR rs731236), 25.58% (TGFBR2 rs3773651), and 20.56% (MUC1 rs4072037).
4. Under five genetic models (allelic, heterozygous, homozygous, dominant, and recessive), and with adjustment for gender, age, smoking, drinking, and family history of gastric cancer, multivariate logistic regression showed that 14 of the 21 associated lncRNA SNPs (rs1859168, rs4784659, rs579501, rs77628730, rs7816475, rs6470502, rs1518338, rs2867837, rs12494960, rs7818137, rs3825071, rs7943779, rs911157, rs16981280) were related to GC risk. Among the 20 EBM-screened and association-verified SNPs, 15 (rs2294008, rs25487, rs751402, rs1801133, rs1799782, rs763780, rs8193036, rs4072037, rs2274223, rs2275913, rs20417, rs13361707, rs3773651, rs1799930, GSTM1) were statistically related to GC risk (P<0.05).
5. wGRS was calculated for the lncRNA SNPs and the EBM SNPs in cases and controls, and the mean wGRS was higher in cases than in controls. The lncRNA SNPs and the EBM-screened and association-verified SNPs were grouped by wGRS value, with the 0-1 group as the reference. GC risk increased significantly with increasing wGRS group, and the distribution of PRS and wGRS was consistent with the grouping results.
6. The PRS based on the EBM-screened and association-verified SNPs (genomic level) was divided into ten quantiles, with the 40-60% quantile as the reference. The results showed that as the quantile
decreased, the overall individual risk decreased, and as the quantile increased, the GC risk increased significantly. GC risk in the lowest 10% quantile was 47% lower than in the general population (OR 0.53; 95% CI: 0.34, 0.83). GC risk in the highest 10% quantile was 3.24 times that in the general population (95% CI: 2.07, 5.06).
7. NRI and IDI were used to estimate the improvement in predictive efficacy when one or more new risk factors were added. The results showed that the PRS model outperformed the wGRS model under the same factors or conditions, and that GC-related lncRNA SNPs, as an independent dataset, effectively improved the discrimination of the model. The best-fitting model was selected according to the ROC curves of models that included different risk factors and a comparison of AIC and BIC between the wGRS and PRS models built on different risk factors. The model combining PRS with lncRNA SNPs, smoking, drinking, and H. pylori infection had the best goodness of fit and predictive ability (AUC: 0.78 (0.68, 0.88), AIC=117.23, BIC=122.31).

Conclusions:
1. There was a significant linear relationship between the cumulative frequency of the combined distribution of genetic and non-genetic factors and the risk of GC, which increased significantly as the population cumulative frequency decreased.
2. H. pylori infection, pickled foods, and hot foods, together with exposure to VDR rs731236, TGFBR2 rs3773651, and MUC1 rs4072037, contributed significantly to the occurrence of GC.
3. GC-related common SNPs and lncRNA SNPs had a significant combined effect. Under the same factors or conditions, the PRS model was better than the wGRS model, and introducing GC-related lncRNA SNPs significantly improved the model's discrimination.
4. The model based on PRS combined with lncRNA SNPs, smoking, drinking, and H. pylori infection had the best predictive ability for GC risk and helps distinguish high-risk groups from the low-risk population.
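
The ARP and PARP figures reported in the Results can be related to a pooled OR with the standard case-control formulas. The sketch below is illustrative only. The abstract does not state which formulas were used, so it assumes ARP = (OR - 1) / OR, with the OR standing in for relative risk, and Levin's formula for PARP. The function names and the exposure-prevalence argument are hypothetical.

```python
def arp_percent(odds_ratio: float) -> float:
    """Attributable risk percentage among the exposed.

    Assumed formula (not given in the abstract): (OR - 1) / OR * 100,
    with the odds ratio used as an approximation of relative risk.
    """
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    return (odds_ratio - 1.0) / odds_ratio * 100.0


def parp_percent(odds_ratio: float, exposure_prevalence: float) -> float:
    """Population attributable risk percentage (Levin's formula).

    exposure_prevalence is the proportion exposed in the source
    population, often approximated by the proportion among controls.
    """
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    if not 0.0 <= exposure_prevalence <= 1.0:
        raise ValueError("exposure_prevalence must be in [0, 1]")
    excess = exposure_prevalence * (odds_ratio - 1.0)
    return excess / (1.0 + excess) * 100.0
```

For example, an OR of 2.0 gives an ARP of 50%, and with an exposure prevalence of 0.5 a PARP of about 33.3%. Read the other way, the reported ARP of 66.33% for stomach disorders would correspond to an OR of roughly 2.97 if this formula applies.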
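
The Hardy-Weinberg check on control genotypes described in the Methods can be sketched as a chi-square goodness-of-fit test for one biallelic SNP. This is a minimal illustration, not the dissertation's own code. The function name and the genotype-count arguments are hypothetical, and the test is assumed to use one degree of freedom, as is usual for a biallelic locus.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts (AA, AB, BB) and returns
    (chi-square statistic, p-value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip genotype classes with zero expectation (monomorphic SNP).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

A SNP whose control genotypes give a small p-value here would typically be flagged during quality control. The cut-off used in the dissertation is not stated in the abstract.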
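
A weighted genetic risk score of the kind compared with PRS above is commonly computed as the sum of risk-allele counts weighted by the natural log of each SNP's odds ratio. The sketch below assumes that weighting. The abstract does not specify the weights actually used, and the function and parameter names are hypothetical.

```python
import math
from typing import Sequence


def weighted_grs(risk_allele_counts: Sequence[int],
                 odds_ratios: Sequence[float]) -> float:
    """Weighted genetic risk score for one individual.

    risk_allele_counts: number of risk alleles (0, 1 or 2) per SNP.
    odds_ratios: per-allele OR for the same SNPs, in the same order.
    Assumed weighting: ln(OR) per risk allele.
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("inputs must have the same length")
    score = 0.0
    for count, odds_ratio in zip(risk_allele_counts, odds_ratios):
        if count not in (0, 1, 2):
            raise ValueError("risk allele count must be 0, 1 or 2")
        if odds_ratio <= 0:
            raise ValueError("odds ratios must be positive")
        score += math.log(odds_ratio) * count
    return score
```

Scores computed this way for cases and controls could then be grouped or cut into quantiles, as in Results 5 and 6, and each group compared with a reference group by logistic regression.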
Keywords/Search Tags: Gastric cancer, Incidence risk, Risk factor, Polygenic risk score, Risk prediction model