| An internal ribosome entry site (IRES) is a segment of mRNA sequence capable of mediating cap-independent translation. IRES-mediated cap-independent translation allows cancer cells to maintain essential functions, helping them develop resistance to stressful conditions such as apoptosis, immune surveillance, chemotherapy, and radiotherapy. A large number of IRES elements in human mRNA remain undiscovered, and identifying them experimentally is complex and laborious. With the advent of big data, deep learning is increasingly applied to the prediction of biological sequences, so building deep learning models to predict IRES elements is a worthwhile approach. Many IRES elements are present in the mRNAs of transcription factors, which are proteins that regulate transcription and control cellular processes including proliferation, apoptosis, differentiation, and inflammation. Dysregulation of transcription factors has been implicated in numerous human diseases, especially cancer. Targeting transcription factors has shown potent therapeutic effects against cancer, highlighting their potential for cancer treatment. Among all cancers, lung cancer is the most aggressive malignancy. Because obvious early symptoms are lacking, the prognosis of lung cancer is extremely poor, with a 5-year survival rate of less than 20%. Given the significant heterogeneity of lung cancer in clinical presentation, histopathology, treatment response, and risk of postoperative recurrence, analyzing the molecular features of the cancer with high-throughput sequencing can identify reliable biomarkers for its diagnosis, prognosis, and treatment. Given the crucial roles of IRES elements and transcription factors in tumorigenesis and progression, we developed a new deep learning-based IRES element identification program to identify genes with IRES elements from
transcription factors. We then screened for genes associated with lung cancer prognosis, constructed a prognostic model, and used the model to assess the prognosis of lung cancer patients. The main work and results of this study are as follows. We developed a new IRES element identification program, IREStest, using an artificial intelligence approach. Its training dataset consists of 1209 experimentally validated sequences, including positive and negative samples. This is the first time a deep learning-based model has been used to construct an IRES element prediction program. The feature extraction module of the program consists of a bidirectional gated recurrent unit (BiGRU). After adding the learning-rate warm-up strategy used in Transformers as an optimization, IREStest achieved 76.25% accuracy, 97.50% sensitivity, 68.42% precision, and an F1 score of 80.41% on the test set, and both accuracy and F1 score reached 86.96% on the independent test set. In contrast, the two existing eukaryotic IRES element prediction programs, IRESfinder and IRESPred, achieved accuracies of 65.00% and 52.50% on the test set, respectively. IREStest thus showed stronger prediction performance than the existing programs in terms of accuracy, sensitivity, and F1 score. Considering the significant roles of IRES elements and transcription factors in tumorigenesis and progression, we aimed to develop a prognostic model based on IRES-transcription factor genes to predict the prognosis of lung cancer patients. IREStest screened 762 IRES-transcription factor genes that may contain IRES elements from 1617 transcription factors. Gene expression data and clinical information for 1019 and 462 lung cancer patients were then obtained from TCGA and GEO, respectively, serving as the training and test sets. Next, 216 genes expressed in both TCGA and GEO patients were retained from the 762
IRES-transcription factor genes, 4190 gene pairs were constructed, and genes associated with prognosis were screened by Cox regression analysis and LASSO regression. Eventually, 17 IRES-transcription factor gene pairs were selected to construct the prognostic model. Based on this model, the 1019 lung cancer patients from TCGA and the 462 lung cancer patients from GEO were divided into high-risk and low-risk groups, and survival analysis demonstrated that the model based on IRES-transcription factor genes was able to assess the survival time of patients (p < 0.001). Independent prognostic analysis showed that the risk score derived from IRES-transcription factor genes was a prognostic factor for lung cancer patients independent of age, sex, grade, and stage. Furthermore, we performed CIBERSORT analysis to explore immune cell differences between the high-risk and low-risk groups, and used GO enrichment analysis to assess differences between the two groups in biological processes, molecular functions, and cellular components. In summary, we used deep learning to develop a new IRES element identification program, IREStest, to identify genes with IRES elements from transcription factors and to further screen genes associated with lung cancer prognosis. Using Cox regression analysis and LASSO regression, we selected 17 IRES-transcription factor gene pairs to construct a prognostic model. This model can predict the prognosis of lung cancer patients, providing new insights and methods for the personalized treatment of lung cancer. The results contribute to further investigation of the molecular mechanisms by which IRES elements and transcription factors act in cancer, and provide a scientific foundation for cancer diagnosis and targeted therapy. |
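
The four test-set metrics quoted for IREStest are all functions of one confusion matrix, so they can be checked against each other. The sketch below computes them from raw counts. The counts used (39 true positives, 18 false positives, 22 true negatives, 1 false negative, i.e. an 80-sequence test set) are an assumption: the abstract does not list them, and they are chosen only because they reproduce the reported percentages.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, precision and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # also called recall
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


if __name__ == "__main__":
    # Assumed counts (not given in the source); they are consistent with
    # the reported 76.25% / 97.50% / 68.42% / 80.41% figures.
    metrics = classification_metrics(tp=39, fp=18, tn=22, fn=1)
    for name, value in metrics.items():
        print(f"{name}: {value:.2%}")
```

Under these assumed counts, the very high sensitivity together with the lower precision would mean the classifier rarely misses a true IRES but labels a fair number of negatives as positive.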
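
The abstract says only that a warm-up learning strategy from Transformers was added during training; it does not give the schedule or its hyperparameters. As an illustration, the sketch below implements the schedule from the original Transformer paper, in which the learning rate rises linearly for a fixed number of warm-up steps and then decays with the inverse square root of the step. The function name `warmup_lr` and the default values of `d_model` and `warmup_steps` are assumptions, not settings taken from IREStest.

```python
def warmup_lr(step, d_model=512, warmup_steps=4000):
    """Learning rate at a given training step (1-based).

    Linear increase for the first `warmup_steps` steps, then decay
    proportional to step ** -0.5. Defaults are illustrative assumptions.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


if __name__ == "__main__":
    # The rate peaks at step == warmup_steps and falls on either side.
    for step in (1, 1000, 4000, 16000):
        print(step, warmup_lr(step))
```

Starting with a small learning rate is generally intended to keep early updates stable while the recurrent weights are still near their random initialization.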
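
The abstract does not state how gene pairs are built or how patients are split into risk groups. The sketch below shows one common convention, offered as an assumption rather than the authors' method: each pair becomes a 0/1 indicator of whether the first gene is expressed more highly than the second in a patient, near-constant pairs are discarded, the risk score is a weighted sum of the indicators, and the median score separates the high-risk and low-risk groups. The gene names, the filter thresholds, and the coefficient are hypothetical; in the study the coefficients would come from the Cox and LASSO regression.

```python
from itertools import combinations
from statistics import median


def build_gene_pairs(expr, min_frac=0.2, max_frac=0.8):
    """Build 0/1 pair indicators from per-patient expression values.

    expr maps gene name -> list of expression values (one per patient).
    A pair is kept only if the fraction of 1s lies in [min_frac, max_frac]
    (assumed filter; the source does not specify one).
    """
    pairs = {}
    for a, b in combinations(sorted(expr), 2):
        indicator = [1 if x > y else 0 for x, y in zip(expr[a], expr[b])]
        frac = sum(indicator) / len(indicator)
        if min_frac <= frac <= max_frac:
            pairs[(a, b)] = indicator
    return pairs


def risk_scores(pairs, coefs):
    """Weighted sum of pair indicators for each patient."""
    n_patients = len(next(iter(pairs.values())))
    return [sum(coefs[p] * pairs[p][i] for p in coefs) for i in range(n_patients)]


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median score."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


if __name__ == "__main__":
    # Toy data: three hypothetical genes measured in four patients.
    expr = {"TF_A": [5, 1, 7, 2], "TF_B": [3, 4, 2, 6], "TF_C": [9, 9, 9, 9]}
    pairs = build_gene_pairs(expr)
    coefs = {("TF_A", "TF_B"): 0.8}  # hypothetical regression coefficient
    scores = risk_scores(pairs, coefs)
    print(split_by_median(scores))
```

Pair indicators depend only on the relative order of two genes within one sample, which is why this kind of feature is often used when the training and test cohorts come from different platforms, as with TCGA and GEO here.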