Screening Potential Biomarkers For Predicting The Occurrence And Prognosis Of Hepatocellular Carcinoma Based On Gene Set Variation Analysis

Posted on: 2024-07-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z S Liang
GTID: 2530307064987399
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: In this study, we used gene set variation analysis (GSVA) to estimate relative enrichment scores for the immunologic and hallmark gene sets and to identify prognosis-related subtypes of cirrhosis and hepatocellular carcinoma (HCC). By screening for differentially expressed gene sets (DESs) between the subtypes, we established a clinically valuable prediction model. We also identified biomarkers that potentially affect the occurrence and development of HCC, which provides new insights into the pathogenesis and targeted treatment of HCC.

Methods: Gene expression and clinical data were drawn from five datasets: 216 patients with cirrhosis from GSE15654, 203 HCC patients with cirrhosis from GSE14520, 67 HCC patients treated with sorafenib from GSE109211, 349 patients with HCC from TCGA-LIHC, and 231 HCC patients from the LIRI-JP dataset in the ICGC database. GSE15654 contains information on the development of HCC in cirrhosis patients, GSE14520 contains survival information for HCC patients with cirrhosis, TCGA-LIHC and LIRI-JP contain survival information for HCC patients, and GSE109211 contains information on the sensitivity of HCC patients to sorafenib treatment.
(1) The Hallmark and Immunologic Signature gene sets were downloaded from the MSigDB database. We used the R package "GSVA" to calculate gene set scores for the samples in GSE15654 and GSE14520, evaluating the overall activity changes of the samples in known biological pathways such as metabolic pathways and immune infiltration.
(2) Based on the gene set scores, k-means clustering was applied to the samples from GSE14520 and GSE15654. The silhouette method, as implemented in the R package "factoextra", was used to determine the optimal number of clusters.
(3) DESs between the cirrhosis subtypes and between the HCC subtypes were obtained using the R package "limma". Univariate Cox regression analysis was conducted on the DESs between subtypes to identify the DESs related to the occurrence of HCC in cirrhosis patients and the DESs related to the survival of HCC patients with cirrhosis. The bile acid metabolism gene set was obtained by intersecting the two groups of DESs using an online Venn diagram tool (http://bioinformatics.psb.ugent.be/webtools/Venn/).
(4) Univariate Cox regression was conducted on the genes in the bile acid metabolism gene set, with HCC occurrence as the outcome, to obtain genes related to HCC occurrence. Taking the cirrhosis patients in GSE15654 as the training set, a model called BM_Score was built from these genes by the Lasso-Cox method using the R packages "survival" and "glmnet". We then validated the predictive efficacy of BM_Score for the survival of HCC patients with cirrhosis in GSE14520. To check whether the predictive ability of BM_Score generalizes to HCC patients overall, we further examined its predictive efficacy for the survival of HCC patients in the TCGA-LIHC and LIRI-JP datasets, and for sorafenib treatment sensitivity in HCC patients in the GSE109211 dataset. Kaplan-Meier (K-M) survival curves and receiver operating characteristic (ROC) curves were drawn to evaluate the predictive efficacy of BM_Score.
(5) The TIMER 2.0 website was used to evaluate the relationship between BM_Score and immune cell infiltration.
(6) All analyses in this study were performed in R (version 4.2.1). The Wilcoxon rank sum test and the chi-square test were used to compare differences between groups; the log-rank test was used to compare survival differences. P<0.05 was considered statistically significant.

Results:
(1) Identification of subtypes in patients with cirrhosis and HCC. Based on the gene set enrichment scores, the 216 cirrhosis patients were clustered into two distinct subtypes (C1/C2). K-M curves showed that, compared with C2, cirrhosis patients in C1 were more prone to developing HCC, liver decompensation, progression in Child-Pugh class, and death (P<0.05). Based on the gene set scores of the GSE14520 dataset, the 203 HCC patients with cirrhosis were clustered into two subtypes (H1/H2). K-M curves showed that HCC patients in H1 had a higher risk of death than those in H2 (P<0.001). HCC patients with cirrhosis in H1 were more likely to have high alpha-fetoprotein levels, tumor size greater than 5 cm, multiple nodules, tumor recurrence, and high TNM, BCLC, and CLIP staging (P<0.05).
(2) Identification of the bile acid metabolism gene set. There were 2211 DESs between the cirrhosis subtypes. With the occurrence of HCC as the outcome variable, univariate Cox regression analysis of these 2211 DESs identified 52 DESs related to the occurrence of HCC. There were 635 DESs between the HCC subtypes. With survival status as the outcome variable, univariate Cox regression analysis of these 635 DESs identified 92 DESs related to the survival of HCC patients with cirrhosis. The bile acid metabolism gene set was obtained by intersecting the two groups of DESs.
(3) Construction and verification of BM_Score. Univariate Cox regression analysis was performed on the 112 genes in the bile acid metabolism gene set, and 15 genes were identified as being associated with HCC occurrence. Lasso-Cox regression analysis was performed on these 15 genes to obtain the optimal prediction model, called BM_Score. For predicting the development of cirrhosis into HCC, the areas under the curve (AUCs) of the time-dependent ROC curves were 0.790, 0.743, and 0.797 at 5, 7, and 10 years. In GSE14520, the AUCs for predicting survival of HCC patients with cirrhosis at 1, 3, and 5 years were 0.667, 0.692, and 0.610. In LIRI-JP, the AUCs for predicting survival of HCC patients at 1, 3, and 5 years were 0.767, 0.651, and 0.355. In TCGA-LIHC, the AUCs for predicting survival of HCC patients at 1, 3, and 5 years were 0.641, 0.602, and 0.567. In GSE109211, the AUC for predicting sorafenib treatment sensitivity in HCC patients was 0.876.
(4) TTR, AR, AQP9, and FDXR are expected to be potential biomarkers of HCC. Five genes (AQP9, AR, FDXR, TFCP2L1, and TTR) were involved in the construction of the BM_Score model. Among them, TTR, AR, and AQP9 were significantly underexpressed in tumor samples compared with adjacent normal samples, while FDXR was significantly overexpressed in tumor samples. TTR, AR, AQP9, and FDXR are expected to be potential biomarkers for the occurrence and prognosis of HCC.

Conclusions:
(1) Bile acid metabolism is involved in the occurrence and development of HCC.
(2) The predictive model constructed from TTR, AR, AQP9, TFCP2L1, and FDXR has good predictive performance for the prognosis of patients with cirrhosis and HCC.
(3) TTR, AR, AQP9, and FDXR have the potential to serve as biological markers for the prognosis of patients with cirrhosis and HCC.
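
As a rough illustration of how a score of the BM_Score kind could be computed and evaluated, the following is a minimal Python sketch; the thesis itself worked in R. It assumes the score is the linear predictor of a Cox model (the sum of coefficient times expression over the five genes named above). The coefficient values, the function names, and the median split are hypothetical placeholders chosen for illustration, not fitted values or methods reported in the thesis. The AUC shown is the simple binary-outcome form (as would apply to a responder/non-responder label), not the time-dependent ROC used for the survival endpoints.

```python
from statistics import median

# Genes named in the abstract. The weights are arbitrary placeholders,
# NOT the Lasso-Cox coefficients fitted in the thesis.
HYPOTHETICAL_COEFS = {
    "AQP9": -0.2,
    "AR": -0.1,
    "FDXR": 0.3,
    "TFCP2L1": 0.1,
    "TTR": -0.4,
}


def risk_score(expression, coefs=HYPOTHETICAL_COEFS):
    """Linear predictor of a Cox model: sum of coefficient * expression."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def split_by_median(scores):
    """Label each score 'high' or 'low' relative to the cohort median."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def auc(scores, labels):
    """Binary AUC: probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

In use, each patient's expression values would be passed to `risk_score`, the cohort split with `split_by_median` for a K-M comparison, and the scores compared against a binary outcome with `auc`.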
Keywords/Search Tags: Hepatocellular carcinoma, cirrhosis, GSVA analysis, immune cell infiltration, tumor microenvironment