Bioinformatics Analysis Of Prognosis Related Genes And Expression Of TOP2A In Non-small Cell Lung Cancer

Posted on: 2020-03-20  Degree: Master  Type: Thesis
Country: China  Candidate: B Wang
GTID: 2404330590455880  Subject: Pathology and pathophysiology
Abstract/Summary:
Objective: Lung cancer is one of the major malignant tumors worldwide, and its morbidity and mortality are the highest among all cancers. Of these cases, 80%~85% are non-small cell lung cancer (NSCLC), which comprises three subtypes: lung adenocarcinoma, squamous cell carcinoma and large cell carcinoma. Immunotherapy, molecular targeted therapy and other approaches have emerged and achieved good clinical results. However, the number of clinically available molecular targets is limited to a few genes such as EGFR, ALK, ROS1, BRAF, RET and C-MET, and these are mainly concentrated in lung adenocarcinoma. For other histological subtypes of NSCLC, there are still few studies on tumor-related genes and targeted therapies. For example, in lung squamous cell carcinoma only a few genes such as FGFR1 and DDR2 are known, most of the related drugs are still in clinical trials, and their clinical effects remain to be tested. For large cell lung cancer, no effective molecular targets have been found. New molecular markers and effective therapeutic targets are important for the early diagnosis, prognosis assessment and early intervention of NSCLC, for sustained control of its progression, and for improving the clinical treatment of NSCLC patients. With the emergence of massive biological data from high-throughput technologies such as second-generation sequencing and protein chips, new and effective molecular targets can be screened efficiently. Currently, publicly available databases include GEO, TCGA, STRING, GEPIA, KM-Plotter and Oncomine. This study mainly relies on the analysis of bioinformatics big data and aims to: 1. Use bioinformatics to analyze genes differentially expressed between NSCLC and normal lung tissues, screen for genes related to prognosis, analyze their biological functions, cellular localization and signaling pathways, and identify key genes in the development of NSCLC. 2. Select the core genes (those most closely connected to other genes) from the above key genes, such as TOP2A, and study the
relationship between them and the clinicopathological features of NSCLC patients, and analyze the specific molecular mechanism by which TOP2A affects the development of NSCLC, as well as its potential as a novel prognostic biomarker and gene therapy target for NSCLC.
Methods: 1. Bioinformatics analysis of genes related to NSCLC development. (1) Screening of genes differentially expressed between NSCLC and normal tissues: (a) three gene/mRNA expression profile datasets, GSE44077, GSE18842 and GSE33532, were selected from the Gene Expression Omnibus (GEO) database and analyzed with the GEO2R online tool to identify differentially expressed genes (DEGs) between NSCLC and normal lung tissues, using adjusted P value < 0.05 and |log2(fold-change)| ≥ 1 as the cutoff criteria; (b) Venny 2.1.0 was used to intersect the up-regulated and down-regulated DEGs of the datasets, respectively, to identify the DEGs common to all of them. (2) Functional analysis of DEGs in NSCLC: (a) FunRich 3.1.0 was used to analyze the cellular localization, molecular functions and biological processes of the main DEGs; (b) the molecular signaling pathways in which the main DEGs were predominantly enriched were analyzed. (3) Screening of core genes and their functional enrichment: (a) a protein-protein interaction (PPI) network was constructed for the DEGs, and 15 central genes with a high degree of connectivity to surrounding genes were selected using Cytoscape 3.6.0; (b) Kaplan-Meier plotter was used to analyze the prognostic significance of the central genes in patients with NSCLC, and genes significantly associated with survival were selected as core genes; (c) functional module analysis of the common DEGs was performed with the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) to show in which biological pathways the core genes play a role.
2. The expression and clinical
significance of the selected core gene TOP2A in NSCLC. (1) Bioinformatics analysis of the relationship between TOP2A and clinicopathological features: (a) TOP2A expression in different tumor types was analyzed using the Oncomine database; (b) the Oncomine results were verified using the GEPIA database; (c) the expression level of TOP2A in NSCLC (lung adenocarcinoma and squamous cell carcinoma) was analyzed using the Human Protein Atlas (HPA) database; (d) clinicopathological data of patients with NSCLC were downloaded from The Cancer Genome Atlas (TCGA) website (information on 972 patients; 484 lung adenocarcinoma and 489 lung squamous cell carcinoma tissue specimens), and the relationship between TOP2A and the clinicopathological features of NSCLC patients was analyzed. (2) Verification of the relationship between TOP2A and clinicopathological features: NSCLC samples from the Department of Pathology and Biological Samples of the Second Hospital of Shanxi Medical University and the corresponding clinicopathological data (107 patients; 61 cases of lung adenocarcinoma and 46 cases of lung squamous cell carcinoma) were collected to prepare tissue microarrays for TOP2A immunohistochemistry (IHC), which was used to analyze the relationship between TOP2A and the clinicopathological features of local NSCLC patients and to compare the findings with the “big data” results. (3) Initial exploration of the molecular mechanism by which TOP2A affects the development of NSCLC: (a) a TOP2A-centered PPI network was drawn in STRING, and the main molecular signaling pathways involving TOP2A and TOP2A-related proteins were analyzed; (b) the Oncomine and STRING databases were used to first screen for genes with the same expression trend as TOP2A, and GEPIA was then used to analyze the specific expression correlation (positive, negative, linear or curvilinear) between TOP2A and the co-expressed genes.
Results: 1. Bioinformatics analysis of prognosis-related genes
in NSCLC. (1) Screening and functional analysis of common DEGs: (a) in the three GEO datasets GSE44077, GSE18842 and GSE33532, we identified 1133, 4459 and 3775 DEGs, respectively, comprising 691, 2505 and 2351 down-regulated genes and 442, 1954 and 1424 up-regulated genes; (b) intersecting the three datasets with Venny 2.1.0 showed 232 up-regulated and 432 down-regulated DEGs shared by all of them, a total of 664 DEGs common to the three datasets, which supports the credibility of their differential expression; (c) in the functional analysis of the 664 common DEGs, GO analysis showed that the 232 up-regulated genes were mainly related to the cell cycle and mitosis, whereas the 432 down-regulated genes were mainly enriched in cell communication, signal transduction, cell adhesion and the plasma membrane; KEGG analysis showed that up-regulated DEGs were mainly involved in cell cycle/mitosis and DNA replication, and down-regulated DEGs were mainly involved in hemostasis, cell surface interactions at the vascular wall, and epithelial-mesenchymal transition (EMT). (2) Determination and functional analysis of core genes: (a) the cytoHubba 0.1 plug-in of Cytoscape 3.6.0 was applied to the PPI network of the common DEGs, and central genes were screened by the degree method (the degree of a node is the number of connections between it and other nodes in the network, i.e. the number of adjacent proteins, and is the most basic parameter of a node); the central genes were TOP2A, CDK1, CCNB1, CDKN3, NDC80, BUB1, CCNA2, MELK, CDC20, KIF23, PBK, KIAA0101, AURKA, CCNB2 and MAD2L1; (b) overall survival (OS) analysis of the 15 genes with Kaplan-Meier plotter showed that 14 were significantly associated with overall patient survival (P<0.05): TOP2A, CCNB1, CDKN3, NDC80, BUB1, CCNA2, MELK, CDC20, KIF23, PBK, KIAA0101, AURKA, CCNB2 and MAD2L1. These 14 genes were taken as core genes for further analysis, and the results suggest that high expression of the 14 core genes is associated with worse OS in NSCLC; (c) we analyzed the functional modules of the DEGs with the MCODE plug-in in
Cytoscape 3.6.0 software, sorting them in descending order by gene count; the FDR values were all far below 0.05. The results show that the 664 common DEGs are mainly enriched in the cell cycle, oocyte meiosis, PI3K-Akt signaling, and progesterone-mediated oocyte maturation pathways, especially the cell cycle.
2. Expression and clinical significance of TOP2A in NSCLC tissues. (1) Analysis of the relationship between TOP2A and clinicopathological features: (a) the Oncomine database was used to analyze the expression level of TOP2A in human tumors. The results showed that TOP2A was overexpressed in most human tumors, including solid tumors of the lung and hematological malignancies, and the GEPIA database confirmed this conclusion. In addition, TOP2A expression was consistently elevated across lung cancer subtypes; (b) using the HPA database, immunohistochemistry results for a TOP2A antibody (No. HPA006458) in normal lung tissue and NSCLC tissues were analyzed. TOP2A protein showed low or no expression in normal lung tissue but strong positive nuclear staining in NSCLC; (c) in the analysis of the TCGA clinicopathological data, TOP2A expression was related only to patient age (P=0.035) and ethnicity (P=0.018); it was not associated with histological type (adenocarcinoma or squamous cell carcinoma), gender, stage, location, smoking, or survival time. (2) Verification of the relationship between TOP2A and clinicopathological features: in the analysis of the local NSCLC patients' clinicopathological data, TOP2A expression was associated with histological subtype (adenocarcinoma or squamous cell carcinoma) (P<0.001), gender (P=0.001) and smoking (P=0.006). It showed no significant association with the age, stage, degree of differentiation, location, size, membrane invasion, bronchial stump invasion, or lymph node metastasis of NSCLC
patients, and these findings differ from those obtained with the TCGA data. (3) Preliminary exploration of the molecular mechanism of TOP2A in NSCLC: (a) enrichment analysis of the TOP2A-centered PPI network showed that its KEGG pathways and Biological Process (BP) terms were mainly enriched in the cell cycle and mitosis; (b) co-expression analysis in the Oncomine database found a strong correlation between TOP2A and TPX2; the TOP2A-centered PPI network, inferred from co-expression, likewise indicated a strong association between TOP2A and TPX2, with a score of 0.993, consistent with the Oncomine results. Pearson correlation analysis in GEPIA showed that TOP2A and TPX2 were positively correlated (R = 0.57, P < 0.05).
Conclusion: 1. Using bioinformatics analysis, we identified 14 core genes that may serve as new prognostic biomarkers and viable gene targets in NSCLC, with TOP2A having the highest degree; 2. Compared with normal lung tissue, NSCLC shows significantly higher TOP2A expression, which is associated with patient prognosis; 3. TOP2A may affect tumor development by regulating the cell cycle, and TOP2A and TPX2 may act synergistically in the occurrence and development of NSCLC.
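The screening logic described in the Methods (cutoff filtering of DEGs, intersection across datasets, and degree-based ranking of a PPI network) can be sketched in a few lines of Python. This is only an illustrative sketch and not the workflow used in the thesis, which relied on GEO2R, Venny and cytoHubba. The function names, the (gene, adjusted P, log2 fold-change) row layout, and the toy gene labels G1 to G4 with their values are all assumptions made for the example.

```python
def filter_degs(rows, p_cutoff=0.05, lfc_cutoff=1.0):
    """Split (gene, adj_p, log2fc) rows into up- and down-regulated gene sets.

    A gene passes when adj_p < p_cutoff and |log2fc| >= lfc_cutoff,
    mirroring the cutoff stated in the abstract.
    """
    up, down = set(), set()
    for gene, adj_p, log2fc in rows:
        if adj_p < p_cutoff and abs(log2fc) >= lfc_cutoff:
            (up if log2fc > 0 else down).add(gene)
    return up, down


def common_genes(gene_sets):
    """Return the genes present in every set (the centre of a Venn diagram)."""
    return set.intersection(*gene_sets)


def rank_by_degree(edges, top_n):
    """Rank nodes of an undirected network by their number of distinct neighbours."""
    neighbours = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    # Sort by degree (descending), then by name so ties are deterministic.
    ranked = sorted(neighbours, key=lambda g: (-len(neighbours[g]), g))
    return [(g, len(neighbours[g])) for g in ranked[:top_n]]


# Toy input only: hypothetical genes and made-up statistics, not thesis data.
dataset_1 = [("G1", 0.001, 2.3), ("G2", 0.01, -1.5), ("G3", 0.20, 3.0), ("G4", 0.03, 0.4)]
dataset_2 = [("G1", 0.004, 1.1), ("G2", 0.02, -2.0), ("G3", 0.01, 1.8)]
dataset_3 = [("G1", 0.03, 1.0), ("G2", 0.001, -1.2), ("G4", 0.001, 2.5)]

per_dataset = [filter_degs(d) for d in (dataset_1, dataset_2, dataset_3)]
common_up = common_genes([up for up, _ in per_dataset])
common_down = common_genes([down for _, down in per_dataset])

toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
hubs = rank_by_degree(toy_edges, top_n=2)

print(sorted(common_up), sorted(common_down), hubs)
```

Up- and down-regulated genes are intersected separately, as in the abstract, so a gene that changes direction between datasets is not counted as a common DEG.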
Keywords/Search Tags: non-small cell lung cancer (NSCLC), differentially expressed genes (DEGs), TOP2A, protein-protein interaction network (PPI), biomarker