| Background: Worldwide, lung cancer is the most frequently diagnosed cancer and the leading cause of cancer-related deaths. Lung adenocarcinoma (LAC) is the most commonly diagnosed histological subtype of lung cancer. Because patients generally remain asymptomatic until a late stage, those who present with symptoms have often missed the optimal window for treatment. Moreover, the main detection and diagnostic methods for non-small cell lung cancer (NSCLC) are diagnostic imaging and tissue biopsy, which are not currently widely used for early diagnosis. Next-generation sequencing technology has provided high-throughput information and new methods for investigating molecular characteristics, diagnostic markers, and therapeutic targets. The current study aimed to identify high-risk LAC patients in order to develop more appropriate treatment and follow-up strategies that improve overall prognosis, and to discover novel biomarkers and molecular mechanisms of LAC to aid in its diagnosis, prognosis, prediction, disease monitoring, and emerging therapies. Methods: Gene expression data and clinical data for patients with LAC were obtained from The Cancer Genome Atlas (TCGA), comprising 498 LAC samples. Low-expressed genes were filtered out and duplicates were removed, leaving a total of 12,914 protein-coding genes. Based on pathologic TNM stage, the 498 samples were divided by stratified randomization into a training set and a validation set in a ratio of 7:3. The training set comprised 348 samples and the validation set comprised 150 samples. For weighted gene co-expression network analysis (WGCNA), 123 samples from training-set patients who completed follow-up were used, together with the expression data for the 12,914 protein-coding genes and the clinical information. WGCNA is a scale-free network construction method suitable for dividing highly correlated genes into modules and relating these modules to the prognosis of LAC. We further analyzed the module that showed the strongest negative correlation with survival time to explore the functional significance of the identified genes and
detect its hub genes. Using the 348 samples in the training set, LASSO Cox regression models were constructed, and hub genes from the selected module that were related to overall survival were selected. Risk scores, calculated as the linear combination of each gene's expression multiplied by its LASSO coefficient, were assessed by receiver operating characteristic (ROC) curve analysis, from which the cutoff was determined. The model was further validated by Kaplan-Meier analysis in the validation set. Results: The constructed weighted gene co-expression network included 42 modules, each containing between 39 and 1,360 genes. One module was positively correlated and six modules were negatively correlated with survival time. We further analyzed the darkred module, which showed the strongest negative correlation with survival time but was not significantly correlated with pathologic TNM stage. To explore the functional significance of the identified genes in the darkred module, its 113 genes were subjected to GO term and KEGG pathway enrichment analyses. Among the genes most negatively correlated with survival time, positive regulation of MAPK cascade, regulation of MAPK cascade, and Toll-like Receptor Cascades were the most significantly enriched terms in the GO term and KEGG pathway enrichment analyses. Twenty genes were defined as hub genes in the darkred module. The LASSO Cox regression model was constructed using several hub genes in the darkred module. Based on the four chosen genes (OPN3, GALNT2, FAM83A, and KYNU), the risk score of every patient was calculated as the linear combination of each gene's expression multiplied by its LASSO coefficient: risk score = (0.0004 × OPN3) + (0.0042 × GALNT2) + (0.0055 × FAM83A) + (0.0077 × KYNU). The model successfully discriminated between LAC patients with low and high overall survival by Kaplan-Meier analysis in the validation set. The risk score was an independent predictor of overall survival. Conclusion: The current study constructed a weighted gene co-expression network of LAC using RNA sequencing and clinical data
for patients with LAC from TCGA. We further analyzed the module that showed the strongest negative correlation with survival time to explore the functional significance of the identified genes and detect its hub genes. A LASSO Cox regression model was constructed and further validated by Kaplan-Meier analysis in the validation set. This study revealed potential signaling pathways and gene co-expression networks for LAC and contributes to potential diagnostic and therapeutic strategies for LAC management and decision-making. |
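
The four-gene risk score reported in the Results can be illustrated with a minimal sketch. The coefficients are those given in the abstract; everything else is an assumption made for illustration only: the function names `risk_score` and `risk_group`, the example expression values, and the cutoff passed to `risk_group`. The abstract states that the cutoff was derived from the ROC curve but gives neither its value nor the expression units, so neither is fixed here.

```python
from typing import Mapping

# LASSO coefficients as reported in the abstract.
LASSO_COEFFICIENTS = {
    "OPN3": 0.0004,
    "GALNT2": 0.0042,
    "FAM83A": 0.0055,
    "KYNU": 0.0077,
}


def risk_score(expression: Mapping[str, float]) -> float:
    """Linear combination of gene expression values and LASSO coefficients."""
    missing = [gene for gene in LASSO_COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {', '.join(missing)}")
    return sum(coef * expression[gene] for gene, coef in LASSO_COEFFICIENTS.items())


def risk_group(score: float, cutoff: float) -> str:
    """Assign 'high' or 'low' risk relative to a caller-supplied cutoff.

    The cutoff is hypothetical here: the abstract does not report its value.
    """
    return "high" if score > cutoff else "low"


# Illustrative expression values only; these are not data from the study.
example_patient = {"OPN3": 100.0, "GALNT2": 200.0, "FAM83A": 50.0, "KYNU": 30.0}
example_score = risk_score(example_patient)
print(f"risk score: {example_score:.3f}")
print(f"risk group at hypothetical cutoff 1.0: {risk_group(example_score, 1.0)}")
```

Because every coefficient is positive, higher expression of any of the four genes raises the score, which is consistent with the genes coming from a module negatively correlated with survival time.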