
Research On Peptide And Major Histocompatibility Complex Class Ⅰ Molecules Binding Affinity Prediction Method

Posted on: 2024-06-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F X Wang
GTID: 1520307376983449
Subject: Computer application technology
Abstract/Summary:
Basic research and clinical applications in immunology, neurology, evolutionary biology, and biomedicine all depend on predicting the binding affinity between peptides and major histocompatibility complex (MHC) class Ⅰ molecules. Changes in peptide-MHC class Ⅰ binding affinity affect antigen presentation and recognition, and therefore the efficacy of tumor immunotherapy. Using peptide-MHC class Ⅰ binding affinity to predict peptides associated with autoimmune diseases can support the diagnosis of immune diseases and transplant rejection. Changes in binding affinity can also affect the function and activity of neurons and glial cells, and help reveal the mechanisms of biological evolution and immune adaptation. At present, the accuracy of both peptide-MHC class Ⅰ binding affinity prediction and allele binding site prediction remains insufficient, which restricts related research. This thesis mines protein sequence features, physicochemical features, and protein domain features in depth, and studies methods for predicting the affinity between peptides and MHC class Ⅰ proteins. The main research contents are as follows:

(1) Protein sequence features provide structural and functional information about proteins and play an important role in peptide-MHC class Ⅰ binding affinity prediction. On this basis, this thesis studies a peptide-MHC class Ⅰ binding affinity prediction method based on protein sequence embedding. The method builds a base model on the multi-head attention mechanism, captures both long-range and short-range relationships between protein tokens, learns sequence features by self-supervision, and uses the sequence data of peptides and MHC class Ⅰ molecules to build a binding affinity prediction model. By focusing on the binding region between the peptide tokens and the MHC class Ⅰ tokens, the method extracts the binding features of the peptide and the MHC class Ⅰ molecule and predicts the binding affinity. Experimental results on the Immune Epitope Database (IEDB) validate that the method improves accuracy. Analysis of the binding motifs of MHC class Ⅰ molecules shows that the method accurately learns the token features of the motifs in related binding proteins and accurately predicts the binding sites of peptides and MHC class Ⅰ molecules.

(2) Protein physicochemical features play an important role in protein structure and function. Different arrangements of amino acids change these physicochemical features, which in turn affects the position at which a peptide binds to an MHC class Ⅰ molecule. On this basis, this thesis studies a peptide-MHC class Ⅰ binding affinity prediction method based on physicochemical embedding. The method embeds 14 physicochemical features and uses the local attention mechanism of a convolutional neural network to extract the physicochemical features of peptides and MHC class Ⅰ molecules. These are combined with the sequence features of the binding region of peptides and MHC class Ⅰ molecules to predict the binding affinity. Experimental results on the IEDB validate that the method improves accuracy. Analysis of MHC class Ⅰ binding motifs shows that the method effectively identifies the binding motifs and antigenic peptide tokens related to the physicochemical features.

(3) The protein domain is the key functional unit through which a protein exerts its biological function, and different MHC class Ⅰ protein domains play different roles when binding to peptides. On this basis, this thesis studies a peptide-MHC class Ⅰ binding affinity prediction method based on protein domain embedding. The method builds a protein domain symbol set, uses multi-head attention to learn the features of peptide bonds and amino acid residues, and predicts the peptide-MHC class Ⅰ binding affinity. Experimental results on the IEDB validate that the method improves accuracy. Analysis of MHC class Ⅰ binding motifs shows that the method effectively identifies and predicts the binding sites of anchor residues of different MHC class Ⅰ molecules and antigenic peptide tokens. The second and ninth amino acids of the peptide play an important role in binding, and different MHC class Ⅰ molecules use different anchoring strategies to bind specific peptides.

(4) To integrate protein sequence features, physicochemical features, and protein domain features effectively and further improve prediction accuracy, this thesis studies a peptide-MHC class Ⅰ binding affinity prediction method based on multi-feature embedding. The method integrates the sequence, physicochemical, and domain features, builds a binding affinity prediction model on the integrated features, and predicts peptide-MHC class Ⅰ binding affinity. Experimental results on the IEDB show that the multi-feature representation further identifies the amino acid binding sites of different antigenic peptides and improves prediction ability. Analysis and comparison of the feature embedding methods reveal the characteristics and applicability of each.

In summary, this thesis focuses on predicting the binding affinity between peptides and MHC class Ⅰ molecules and studies in depth peptide-MHC class Ⅰ binding affinity prediction methods based on protein features. It mines sequence features and physicochemical features, and retains the structural mutation and structural evolution features of protein domains through protein domain tokenization, thereby improving the accuracy of affinity prediction between peptides and MHC class Ⅰ molecules. Experimental results on the IEDB verify the effectiveness of the methods. In addition, the analysis and comparison of the feature embedding methods identify the alleles for which each embedding method is best suited.
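
As a rough illustration of how attention can relate peptide tokens to MHC class Ⅰ tokens, the sketch below computes single-head scaled dot-product attention weights, the building block of multi-head attention. It is not the thesis model: the one-hot residue encoding stands in for learned embeddings and projections, the function names are hypothetical, and the MHC fragment is a made-up placeholder string rather than a real allele sequence.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(residue):
    """Toy embedding (assumption): a 20-dimensional one-hot vector per residue."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(peptide, mhc):
    """Attention weights from each peptide residue (query) to each MHC residue (key).

    A trained model would apply learned query/key projections per head;
    here the raw one-hot vectors are used directly to keep the sketch small.
    """
    queries = [one_hot(r) for r in peptide]
    keys = [one_hot(r) for r in mhc]
    scale = math.sqrt(len(AMINO_ACIDS))  # sqrt of the key dimension
    weights = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights.append(softmax(scores))
    return weights


if __name__ == "__main__":
    peptide = "GILGFVFTL"       # a 9-mer peptide used only as sample input
    mhc_fragment = "YFAMYQENM"  # hypothetical placeholder, not a real allele
    for residue, row in zip(peptide, cross_attention(peptide, mhc_fragment)):
        print(residue, [round(w, 3) for w in row])
```

Each row of the result is a probability distribution over MHC positions for one peptide residue. With learned embeddings, large weights in the rows for peptide positions 2 and 9 would correspond to the anchor-residue interactions described in part (3).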
Keywords/Search Tags: Multi-head Attention, Self-supervised Learning, Feature Embedding, MHC Class I Molecule, Binding Affinity Prediction