
Identification Of Hepatocellular Carcinoma Hub Genes And Construction Of Prognostic Models Based On Weighted Gene Co-Expression Network

Posted on: 2022-10-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Song
GTID: 1524306902477334
Subject: Imaging and nuclear medicine
Abstract/Summary:
Part 1: Identification of hub genes in hepatocellular carcinoma by weighted gene co-expression network analysis

Objective: The high heterogeneity of hepatocellular carcinoma (HCC) limits the effect of therapy, so new biomarkers are urgently needed to refine the subtype classification of HCC. This study aimed to screen the hub genes that affect the pathological process of HCC, using multiple data sets and a global network approach, and thereby to find highly efficient HCC biomarkers.

Methods: The robust rank aggregation (RRA) method was used to integrate and analyze nine high-quality HCC data sets from the Gene Expression Omnibus (GEO) database and to screen a set of genes differentially expressed between HCC and normal liver tissue samples. Weighted Gene Co-expression Network Analysis (WGCNA) was then applied to cluster these genes into multiple gene modules. The modules that were highly correlated with clinical characteristics were taken as the key modules. The potential functions of the key modules were explored through functional enrichment analysis, and protein-protein interaction (PPI) network analysis was then used to identify the hub genes within them. Finally, TCGA-LIHC was used as an independent data set to verify the difference in hub gene expression between tumor and paracancerous tissues, and to explore the correlation of hub gene expression with clinical stage and overall survival.

Results: The combined RRA and WGCNA analysis found two key modules that were significantly correlated with clinical characteristics. Together with the PPI network, a total of 28 hub genes were identified. Of these, the 19 genes in the brown module were up-regulated in HCC and positively correlated with tumor (TNM) stage, while the 9 genes in the turquoise module showed the opposite trend. Survival analysis showed that all hub genes were significantly related to patients' overall survival time. Notably, four of the novel risk genes, DAO, PCK2, SLC27A2 and HAO1, have rarely been reported in previous studies on HCC.

Conclusion: The combined bioinformatics analysis using RRA, WGCNA and PPI provides a new method for revealing the complex biological mechanism of HCC and helps to identify potential HCC biomarkers.

Part 2: Construction and verification of the risk prediction model based on hub genes for hepatocellular carcinoma

Objective: The 28 HCC hub genes identified in Part 1 are clearly associated with differential expression, tumor stage and overall survival time. On this basis, we aimed to build a survival prediction model with strong predictive ability.

Methods: We obtained mRNA expression profiles and corresponding clinical data of 341 HCC patients from The Cancer Genome Atlas (TCGA), stratified them by TNM clinical stage, and divided them into a training set and an internal test set at a ratio of 7:3. LASSO-Cox regression was used on the training set to further screen the 28 hub genes and construct the survival prediction model, which was then verified in the internal test set. To verify the reproducibility of the model, we obtained 229 HCC transcriptome profiles from the LIRI-JP data set of the International Cancer Genome Consortium (ICGC) database as an independent external validation set. The model formula was applied to calculate the risk score of each sample. Within the training set, the internal test set and the external validation set, samples were divided into high-risk and low-risk groups by the median risk score of the respective set. Survival analysis and time-dependent ROC curves were used to present the results of model validation.

Results: LASSO-Cox regression selected five hub genes as predictors, namely KIF20A, CDC20, MCM2, CCNB2 and EHHADH, with coefficients of 0.2315, 0.2543, 0.2151, 0.2488 and -0.0185, respectively. In both validation sets, the high-risk group had a poor prognosis (internal test set: HR=13, 95% CI=3.9-44, p<0.001; ICGC-LIRI: HR=1.2, 95% CI=1.1-1.2, p<0.001). The best area under the curve (AUC) of the prediction model was 0.85 in the internal test set and 0.81 in the external validation set.

Conclusion: The prognostic model constructed with KIF20A, CDC20, MCM2, CCNB2 and EHHADH can identify patients with a higher risk of death from HCC. The predictive performance of the model was verified both internally and in an independent data set.

Part 3: Verification by proteomics and preliminary application in clinical practice for the prognostic predictive model of 5 hub genes

Objective: The predictive model of 5 hub genes was further verified in a proteomics data set and in a clinical postoperative HCC cohort, as a preliminary study of the clinical application of the model.

Methods: We obtained protein expression data and prognostic information for tumor tissues and paired adjacent tissues of 159 HCC patients from the Clinical Proteomic Tumor Analysis Consortium (CPTAC). In addition, sections of postoperative pathological HCC tissue and adjacent tissue from 156 cases were obtained from our hospital. The expression of KIF20A, CDC20, MCM2, CCNB2 and EHHADH in each sample was detected by immunohistochemistry (IHC), and the IHC-profile plug-in of the ImageJ software was used for semi-quantitative analysis of the immunohistochemistry results. After the expression of the 5 proteins had been obtained in the CPTAC-HCC cohort and in the clinical sample cohort, the prognostic prediction model of the previous part (Risk Score = KIF20A(expression) × 0.2315 + CDC20(expression) × 0.2543 + MCM2(expression) × 0.2152 + CCNB2(expression) × 0.2488 - EHHADH(expression) × 0.0185) was verified in the two cohorts separately. The samples of each cohort were divided into high-risk and low-risk groups according to the median risk score of that cohort. Kaplan-Meier survival analysis and time-dependent ROC curves were used to assess the predictive performance of the model in the two protein data cohorts. The differences in the expression of the 5 proteins between tumor tissues and adjacent tissues were also tested. Finally, the C-index and decision curve analysis were used to compare the predictive power of the prognostic model with that of other clinical parameters.

Results: In the CPTAC-HCC cohort, KIF20A, CDC20 and MCM2 were up-regulated in tumor tissues, and EHHADH was down-regulated in tumor tissues. Survival analysis showed that the high-risk group had a worse prognosis (HR=1.4, 95% CI: 1.2-1.7, p<0.001). The time-dependent ROC curves gave AUC values of 0.687, 0.662, 0.639, 0.716 and 0.716 for years 1 to 5, respectively. In the clinical HCC sample cohort, the expression of KIF20A, CDC20, MCM2 and CCNB2 was up-regulated in tumor tissues, while the expression of EHHADH did not differ significantly between tumor and paracancerous tissues. Survival analysis of the model in the clinical cohort showed that the high-risk group had a poor prognosis (HR=1.1, 95% CI: 1.1-1.1, p<0.001). The time-dependent ROC curves gave AUC values of 0.894, 0.82 and 0.864 at 1, 3 and 5 years, respectively, and the predictive power of the risk score was better than that of any single gene. In addition, the C-index curve and the clinical decision curve showed that the predictive power of this model was better than that of serum AFP level and China liver cancer staging (CNLC), and that the risk score combined with serum AFP level and clinical stage (CNLC) had a better predictive effect.

Conclusion: 1. In HCC tumor tissues, the protein expression of KIF20A, CDC20, MCM2 and CCNB2 is higher than in normal liver tissue; 2. High expression of KIF20A, CDC20, MCM2 and CCNB2 in HCC tissues indicates a poor prognosis; 3. The HCC prognosis model composed of KIF20A, CDC20, MCM2, CCNB2 and EHHADH can be applied to protein expression data; 4. This model combined with serum AFP level and clinical stage (CNLC) may have a better predictive effect; 5. This prognostic model is helpful for subtype classification of early HCC.
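
The risk-score calculation and the median split described in Parts 2 and 3 can be illustrated with a minimal Python sketch. It is not the author's code. The abstract gives the MCM2 coefficient as 0.2151 in the Part 2 results and as 0.2152 in the Part 3 formula; the sketch assumes 0.2151. The function names, the sample names and every expression value are hypothetical and chosen only for illustration. Assigning a sample whose score equals the median to the low-risk group is also an assumption, since the abstract does not say how ties are handled.

```python
from statistics import median

# Coefficients as listed in the Part 2 results of the abstract.
# Assumption: MCM2 uses 0.2151 (the Part 3 formula prints 0.2152).
COEFFICIENTS = {
    "KIF20A": 0.2315,
    "CDC20": 0.2543,
    "MCM2": 0.2151,
    "CCNB2": 0.2488,
    "EHHADH": -0.0185,
}


def risk_score(expression):
    """Return the sum of coefficient * expression over the five genes."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label a sample 'high' if its score exceeds the cohort median, else 'low'.

    Assumption: scores equal to the median are labelled 'low'.
    """
    cutoff = median(scores.values())
    return {name: ("high" if s > cutoff else "low") for name, s in scores.items()}


# Hypothetical expression values, not data from the study.
cohort = {
    "sample_A": {"KIF20A": 1.0, "CDC20": 1.0, "MCM2": 1.0, "CCNB2": 1.0, "EHHADH": 5.0},
    "sample_B": {"KIF20A": 3.0, "CDC20": 3.0, "MCM2": 3.0, "CCNB2": 3.0, "EHHADH": 1.0},
    "sample_C": {"KIF20A": 2.0, "CDC20": 2.0, "MCM2": 2.0, "CCNB2": 2.0, "EHHADH": 2.0},
    "sample_D": {"KIF20A": 0.5, "CDC20": 0.5, "MCM2": 0.5, "CCNB2": 0.5, "EHHADH": 6.0},
}

scores = {name: risk_score(expr) for name, expr in cohort.items()}
groups = split_by_median(scores)

for name in cohort:
    print(f"{name}: score={scores[name]:.4f}, group={groups[name]}")
```

Because the cutoff is recomputed from each cohort's own scores, the same expression profile could fall into different groups in different cohorts, which matches the abstract's statement that each set was split by its own median.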
Keywords/Search Tags: Hepatocellular carcinoma, hub genes, WGCNA, Prognosis, LASSO, Biomarkers, Model validation, Clinical transformation