| Objective: Using bioinformatics theory and methods together with related gene databases, differentially expressed breast cancer genes in the TCGA database were screened and statistically analyzed to identify genes related to patient survival and prognosis, and a multi-gene prognostic model was constructed from them. Each patient's multi-gene model score was then combined with the patient's clinicopathological information to construct a combined prognostic model, from which independent prognostic factors were extracted to establish a breast cancer prognostic nomogram for clinical reference. GSEA enrichment analysis, gene database queries, and a literature search were used to provide new ideas for finding possible new therapeutic targets and prognostic markers for breast cancer patients. Methods: 1. After screening of the TCGA database, 1109 breast cancer tissues and 113 paracancerous tissues were included. Gene expression data and the corresponding clinicopathological information were downloaded for these samples, including the patient's gender, age, ethnicity, pathological classification, overall survival (OS) in days, survival status, and other data. 2. The "edgeR" package for R was used to perform differential expression analysis of breast cancer genes, with a corrected P < 0.05 and a fold change > 2 (FDR < 0.05 and |log2FC| > 1) as the screening criteria, to obtain a catalog of differentially expressed breast cancer genes. 3. The patients' clinicopathological data and differential gene expression data were analyzed together: univariate Cox regression analysis was performed with an R package, LASSO regression analysis was performed on its results, and differential gene combinations were selected by the parameter lambda for subsequent analysis. From the selected differential genes, a matrix of gene expression combined with clinicopathological data was constructed and a multivariate Cox
regression analysis was performed. Genes were further screened by the stepwise method to construct a multi-gene prognostic model. A survival-related risk assessment model was then constructed from the expression of the model's constituent genes and their multivariate regression coefficients. 4. The multi-gene model scores of the breast cancer patients were obtained together with clinicopathological information such as patient number, age (years), clinical stage (stage Ⅰ, stage Ⅱ, stage Ⅲ, stage Ⅳ), gender (male/female), overall survival days, and survival status. With overall survival days and survival status together as the dependent variables and the remaining items as independent variables, univariate Cox regression analysis was performed. The independent variables that met the screening conditions were entered into a multivariate Cox regression model, the stepwise method was used to screen the variables and determine the final independent prognostic factors, and a nomogram was established from these factors. 5. GSEA-4.10 software was used to perform enrichment analysis on representative genes. In addition, the GEPIA online analysis tool was used to perform correlation analysis on representative genes in the joint gene model, and pathological immunohistochemistry and cellular immunofluorescence images of the related genes were queried and downloaded from the HPA database. Results: 1. The final multi-gene prognostic model consisted of 20 genes (MMP13, DCTPP1, XG, DLG3, MAL2, PAICS, PEX5L, PCDHGA2, BAMBI, AC011294.1, FIBCD1, AC026785.3, TH, BSND, ZPBP2, GVINP2, CBX1P3, SPRR4, AC093809.1, CCDC74BP1). 2. A combined breast cancer prognostic model was established from the patients' clinical information, age, and clinicopathological stage, and a nomogram was constructed from the independent prognostic factors for clinical reference. Statistical testing showed that the models had good fit and predictive performance. 3. GSEA enrichment analysis of the 20 differential genes yielded 10 gene enrichment results that
meet the screening conditions. Overall survival or disease-free survival plots for some of the genes were obtained from the GEPIA database, and pathological immunohistochemistry images and cellular immunofluorescence staining images for some of the genes were obtained from the HPA database. Conclusion: 1. A combined gene model of breast cancer was constructed, and a nomogram was built from the independent prognostic factors for clinical reference. 2. Based on a comprehensive analysis of the existing literature, this study suggests that nine genes (DCTPP1, DLG3, FIBCD1, MAL2, MMP13, PAICS, XG, PCDHGA2, and GVINP2) may become new therapeutic targets and new prognostic markers for breast cancer, providing new ideas for the clinical treatment and prognostic assessment of breast cancer. |
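
Two of the steps in the Methods lend themselves to a short illustration: the differential-expression filter (FDR < 0.05 and a fold change above 2, i.e. |log2FC| > 1) and the risk score built from gene expression and multivariate Cox coefficients. The following is only a minimal sketch and not the study's pipeline, which was run in R. It assumes the risk score is the usual Cox linear predictor (the sum of coefficient times expression over the model genes), and the function names, table layout, coefficients, and expression values are all hypothetical placeholders rather than results from the study.

```python
def filter_degs(rows, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Return names of genes with FDR < fdr_cutoff and |log2FC| > lfc_cutoff.

    `rows` is a hypothetical table layout: one dict per gene with the keys
    "gene", "log2fc" and "fdr".
    """
    return [
        row["gene"]
        for row in rows
        if row["fdr"] < fdr_cutoff and abs(row["log2fc"]) > lfc_cutoff
    ]


def risk_score(expression, coefficients):
    """Assumed risk score: sum of coefficient * expression over model genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


# Made-up differential-expression results (not data from the study).
de_table = [
    {"gene": "GENE_A", "log2fc": 2.0, "fdr": 0.01},    # kept: up-regulated
    {"gene": "GENE_B", "log2fc": 0.5, "fdr": 0.01},    # dropped: fold change too small
    {"gene": "GENE_C", "log2fc": -1.5, "fdr": 0.20},   # dropped: not significant
    {"gene": "GENE_D", "log2fc": -3.0, "fdr": 0.001},  # kept: down-regulated
]
selected = filter_degs(de_table)

# Placeholder coefficients and expression values for two of the model genes.
placeholder_coefficients = {"MMP13": 0.5, "XG": -0.25}
placeholder_expression = {"MMP13": 2.0, "XG": 4.0}
score = risk_score(placeholder_expression, placeholder_coefficients)

print(selected)
print(score)
```

Under this assumption, a positive coefficient raises the score as expression rises and a negative coefficient lowers it, so each patient is reduced to a single number that can be entered alongside age and clinical stage in the combined model.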