Research On Tumor Neoantigen Prediction Method Based On Deep Learning

Posted on: 2023-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: S H Li
GTID: 2544307061954709
Subject: Biomedical engineering
Abstract/Summary:
Cancer is a serious threat to human life and health, and immunotherapy has opened a new era in cancer treatment. Neoantigens are tumor-specific antigens derived from non-synonymous mutations. Neoantigen-based tumor vaccines and adoptive cell therapy have been applied successfully in clinical practice, which indicates that tumor neoantigens are ideal targets. However, experimental neoantigen testing is very costly, so neoantigen prediction is needed to support further research. Although prediction algorithms such as the NetMHCpan series have achieved some success in tumor neoantigen research, they still suffer from low recall or low precision. In addition, most existing antigen presentation prediction models are based on monoallelic data, which cannot fully reflect the complexity of peptide presentation in vivo, and the available multiallelic antigen presentation prediction algorithms do not handle this complexity well. New multiallelic antigen presentation prediction algorithms are therefore urgently needed. On the one hand, advances in genomics and proteomics have led to the gradual accumulation of a large amount of protein mass spectrometry data. On the other hand, the rapid development of deep learning provides more theories and tools for neoantigen prediction. Based on the characteristics of antigen presentation, this thesis develops tumor neoantigen prediction models that draw on recent advances in deep learning. The specific research contents are as follows:

(1) A monoallelic class Ⅰ major histocompatibility complex (MHC-Ⅰ) antigen presentation prediction model, STMHCPan, was established based on the Star-Transformer algorithm. Star-Transformer models have demonstrated improved performance on small and medium-sized data sets, which is a significant advantage given the small size of the neoantigen data set. We studied four deep learning algorithms (Transformer Encoder, Star-Transformer, Star-Transformer-CNN and Star-Transformer-LSTM) and established an MHC-Ⅰ antigen presentation prediction model with each. Evaluated on the independent test sets collected for the latest version, NetMHCpan4.1, Star-Transformer outperformed the other models and existing methods such as NetMHCpan4.1. We then trained STMHCPan on large MHC-Ⅰ data sets collected from IEDB. Compared with NetMHCpan4.1, STMHCPan performed better on the IEDB presentation data set, the external test data set, the T cell responsive epitope data set and the T cell responsive neoantigen data set. The STMHCPan model was validated by visualizing the predicted peptide motifs, visualizing the attention weights at the model's attention level, and comparing the differences. Transferring the trained antigen presentation prediction model to neoantigen prediction effectively reduced the number of false positives.

(2) Considering the multiallelic complexity of the antigen presentation environment, we established the ResMAHPan model, based on ResNet, for predicting HLA class Ⅰ antigen presentation. We exploited the feature extraction capability of the ResNet model, proposed a new peptide-multi-allele encoding method, and adopted a new attention mechanism. ResMAHPan was compared with MHCflurry2.0 and NetMHCpan4.0 on the MHCflurry2.0 independent test set. The results show that our method has better overall prediction performance and can predict MHC-Ⅰ antigen presentation efficiently and accurately. We also applied ResMAHPan to the MHC-Ⅱ antigen presentation prediction task, where it achieved results comparable to MARIA on the MARIA benchmark test set. Finally, we compared the performance of STMHCPan and ResMAHPan under different ratios of positive to negative samples. The results show that ResMAHPan has higher accuracy and better overall performance under realistic conditions.

(3) A web interface was developed to deploy the STMHCPan and ResMAHPan models, and it was used to analyze a metastatic melanoma cohort, in which STMHCPan and ResMAHPan identified potential shared neoantigens. Finally, a neoantigen prediction pipeline was proposed for predicting neoantigens from whole exome sequencing/whole genome sequencing (WES/WGS) data, and an analysis of pancreatic ductal adenocarcinoma sequencing data showed the potential of this pipeline.
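
The comparison under different positive and negative sample ratios in (2) matters because precision depends strongly on class balance: presented peptides are usually far outnumbered by non-presented ones. The sketch below is not taken from the thesis. It is a minimal illustration in Python, and the function name `precision_at_ratio` and the sensitivity and specificity values are assumptions chosen only to show how the expected precision of a fixed classifier falls as more negatives are added per positive.

```python
def precision_at_ratio(sensitivity: float, specificity: float,
                       negatives_per_positive: float) -> float:
    """Expected precision (PPV) of a classifier with fixed sensitivity and
    specificity when every positive sample comes with
    `negatives_per_positive` negative samples.

    Hypothetical helper for illustration; not part of STMHCPan or ResMAHPan.
    """
    # Expected true positives per positive sample.
    true_pos = sensitivity
    # Expected false positives contributed by the accompanying negatives.
    false_pos = (1.0 - specificity) * negatives_per_positive
    predicted_pos = true_pos + false_pos
    if predicted_pos == 0.0:
        # Nothing is predicted positive, so precision is reported as 0.
        return 0.0
    return true_pos / predicted_pos


if __name__ == "__main__":
    # Illustrative operating point (assumed values, not results from the thesis).
    sens, spec = 0.90, 0.99
    for ratio in (1, 10, 100, 1000):
        ppv = precision_at_ratio(sens, spec, ratio)
        print(f"positive:negative = 1:{ratio:<5} precision = {ppv:.3f}")
```

With these assumed values, a model that looks nearly perfect on a balanced test set loses most of its precision at a 1:1000 ratio, which is why evaluation at several ratios gives a more realistic picture than a single balanced benchmark.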
Keywords/Search Tags: Neoantigen, Deep learning, MHC-Ⅰ, MHC-Ⅱ, Immunotherapy