| Objective: Pancreatic cancer is a malignant tumor of the digestive system with insidious onset and poor prognosis. It accounts for 2% of all malignant tumors; its incidence has increased significantly in our country, and its mortality has risen to fifth place. Although medicine has advanced rapidly and the diagnosis and treatment of cancer have made great progress, the rate of early diagnosis remains low and most cases are found at an advanced stage. The operation rate is low, the scope of surgical resection is limited, and the postoperative recurrence rate is as high as 80%. As a result, the 5-year survival rate of patients with pancreatic cancer is still very low. Bioinformatics is an interdisciplinary field that combines life science and computer science. In this study, bioinformatics methods were used to collect high-throughput sequencing results, analyze the differentially expressed genes in the pathogenesis of pancreatic cancer, and screen out the key differentially expressed genes for in-depth study of the molecular mechanism underlying the occurrence and development of pancreatic cancer. Methods: First, the mRNA microarray datasets GSE16515, GSE28735 and GSE41368 were downloaded from the GEO gene expression database of NCBI, and the R language was used to normalize the data and identify the differentially expressed genes (DEGs) between pancreatic cancer tissue and normal tissue. The DEGs of each of the three datasets were displayed in volcano plots, and the intersection of the three datasets was shown in an UpSet plot. Second, the DEGs obtained in R were subjected to GO (Gene Ontology) functional enrichment analysis and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway enrichment analysis. A protein-protein interaction (PPI) network of the DEGs was built with the STRING database; the network was visualized with Cytoscape and analyzed with its 
plug-in CytoHubba to screen out key candidate genes. The key candidate genes were then analyzed with the online tool SurvExpress, and their expression was analyzed with the online tool GEPIA. Patient samples from the Affiliated Hospital of Qingdao University were used to verify both the gene expression levels and the survival analysis. Results: With P < 0.05 and |logFC| > 1 as the threshold, 1874 DEGs were screened in GSE16515 (1630 up-regulated, 244 down-regulated), 603 in GSE28735 (362 up-regulated, 241 down-regulated), and 1837 in GSE41368 (1300 up-regulated, 537 down-regulated). The intersection of the three datasets contained 391 DEGs. The biological function of the DEGs was analyzed by GO enrichment analysis. In the biological process (BP) category, the DEGs were significantly enriched in cellular processes, mainly extracellular matrix organization and extracellular structure organization. KEGG pathway analysis showed that these genes were significantly enriched in pathways related to protein digestion and absorption, transcriptional misregulation in cancer, and pathways in cancer. Eight key genes (CP, CXCL10, EGF, ITGA2, KRT19, MMP1, MMP14 and PLAU) closely related to survival in pancreatic cancer were identified by analyzing the PPI network. Kaplan-Meier curves were drawn for each of the 8 key genes, and a predictive model based on the 8 key genes was established. Pancreatic cancer patients were divided into a high-risk group and a low-risk group. For each of the 8 key genes, survival was worse in the high-risk group than in the low-risk group, and the combined analysis of the 8 genes likewise showed better survival in the low-risk group than in the high-risk group. This prognostic signature was further verified in patients with pancreatic cancer at the Affiliated Hospital of 
Qingdao University. The overall survival (OS) of the high-risk group was significantly lower than that of the low-risk group. In the TCGA database, the expression of CXCL10, CP, ITGA2, KRT19, MMP1, MMP14 and PLAU was higher in pancreatic cancer tissues than in adjacent tissues, whereas the expression of EGF was lower in pancreatic cancer tissues than in adjacent tissues. This result was verified in our patient specimens. Conclusions: The gene signature identified from mRNA microarray data can predict the prognosis of pancreatic cancer and may help guide the selection of therapeutic targets for pancreatic cancer. |
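
The DEG screening rule described in the Results (P < 0.05 and |logFC| > 1, followed by the intersection of the three datasets) can be illustrated with a minimal sketch. The study itself used R; the Python below is only an illustration under stated assumptions. The dataset labels, gene names, and values are invented placeholders, not data from GSE16515, GSE28735, or GSE41368, and the helper `select_degs` is hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical input: for each dataset, gene -> (logFC, p_value).
# All names and numbers below are invented placeholders for illustration.
Stats = Dict[str, Tuple[float, float]]


def select_degs(stats: Stats, p_cutoff: float = 0.05,
                lfc_cutoff: float = 1.0) -> Tuple[Set[str], Set[str]]:
    """Return (up-regulated, down-regulated) genes passing both thresholds."""
    up = {g for g, (lfc, p) in stats.items() if p < p_cutoff and lfc > lfc_cutoff}
    down = {g for g, (lfc, p) in stats.items() if p < p_cutoff and lfc < -lfc_cutoff}
    return up, down


toy: Dict[str, Stats] = {
    "dataset_1": {"GENE_A": (2.4, 0.001), "GENE_B": (-1.6, 0.01),
                  "GENE_C": (1.8, 0.03), "GENE_D": (0.4, 0.2)},
    "dataset_2": {"GENE_A": (1.9, 0.002), "GENE_B": (-1.2, 0.04),
                  "GENE_C": (0.7, 0.01), "GENE_E": (1.5, 0.2)},
    "dataset_3": {"GENE_A": (3.1, 0.0005), "GENE_B": (-2.0, 0.02),
                  "GENE_C": (1.3, 0.04)},
}

# DEGs per dataset are the union of up- and down-regulated genes.
degs_per_dataset = {}
for name, stats in toy.items():
    up, down = select_degs(stats)
    degs_per_dataset[name] = up | down

# Genes that are differentially expressed in every dataset.
common = set.intersection(*degs_per_dataset.values())
print(sorted(common))
```

In this toy input, GENE_C drops out of the intersection because its |logFC| in `dataset_2` is below 1, and GENE_E is excluded because its P value is above 0.05. This shows why the shared set is much smaller than any per-dataset list.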