
Flexibility-aware Overlapping Subgraph Clustering For Epitope Prediction

Posted on: 2022-08-04
Degree: Master
Type: Thesis
Country: China
Candidate: C Gao
Full Text: PDF
GTID: 2480306536954449
Subject: Computer Science and Technology
Abstract/Summary:
An epitope of an antigen is composed of a sequence of specific, adjacent residues that can be recognized by antibodies to activate immune responses. Epitope identification is of great practical significance to modern vaccine design, reagent preparation, and drug development. Experimental methods are accurate but time-consuming and costly, so predicting epitopes with low-cost, high-performance computational methods is more attractive. Although massive effort has been invested in computational models, there is still considerable room for improvement. One possible limitation is poor prediction performance for overlapping epitopes: because most existing methods focus on single epitopes or multiple separated epitopes, they cannot accurately detect overlapping epitopes. Another possible limitation is that existing computational models identify epitopes from static antigen data. Flexibility is an inherent property of proteins that makes an antigen and an antibody bind dynamically, yet it is not considered when epitope prediction models are constructed. To address these limitations, this thesis proposes two models: SGLMEP (short for overlapping subgraph mining based on L-Metric for epitope prediction) and FAGEP (short for flexibility-aware overlapping subgraph clustering for epitope prediction).

The first model, SGLMEP, targets overlapping epitope prediction. SGLMEP starts from an atom-level graph built on the surface atoms of an antigen. This atom-level graph is then upgraded to a residue-level graph. The residue-level graph is divided into non-overlapping subgraphs, called seeds, via the Markov clustering algorithm. These seeds are further expanded with the overlapping subgraph discovery algorithm. Finally, a classifier based on a graph convolutional network classifies the subgraphs as epitopes or non-epitopes. SGLMEP was compared with the state-of-the-art models BepiPred 2.0, SEPPA 3.0, EpiPred and Glep. The experimental results show that SGLMEP outperforms these models, lifting the F1-score by 96.6%, 123.8%, 65.4% and 3.5%, respectively. In addition, the ablation results show that local overlapping community detection is effective for detecting overlapping epitopes.

Building on further study of SGLMEP, the second model, FAGEP, was proposed. Compared with SGLMEP, FAGEP improves on two aspects: it constructs a denser atom-level graph by incorporating protein flexibility, and it uses a two-stage overlapping subgraph discovery algorithm based on local overlapping community detection. FAGEP was compared with the state-of-the-art models BepiPred 2.0, SEPPA 3.0, EpiPred, Glep and SGLMEP. The experimental results show that the prediction performance of FAGEP is significantly improved, reaching an F1-score of 0.656. In detail, the F1-score is increased by 120.9%, 151.3%, 85.8%, 16.3% and 12.3%, respectively. Further experimental results show that these improvement strategies greatly enhance epitope prediction performance.
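The seed-generation step described for SGLMEP (dividing the residue-level graph into non-overlapping subgraphs with Markov clustering) can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `markov_cluster`, the parameter defaults, and the toy six-node graph are all assumptions made for illustration, and a real residue-level graph would be built from antigen surface atoms rather than written by hand.

```python
import numpy as np


def markov_cluster(adjacency, expansion=2, inflation=2.0, max_iter=100, tol=1e-6):
    """Basic Markov clustering (MCL) on an undirected graph.

    Returns a sorted list of non-overlapping node lists ("seeds").
    Hypothetical helper for illustration; defaults are assumed values.
    """
    # Add self-loops, then normalize each column to sum to 1.
    m = np.asarray(adjacency, dtype=float) + np.eye(len(adjacency))
    m /= m.sum(axis=0, keepdims=True)

    for _ in range(max_iter):
        prev = m
        m = np.linalg.matrix_power(m, expansion)  # expansion: spread random-walk flow
        m = np.power(m, inflation)                # inflation: favor strong flows
        m /= m.sum(axis=0, keepdims=True)         # renormalize columns
        if np.abs(m - prev).max() < tol:
            break

    # Assign each node (column) to the row that holds most of its flow.
    attractors = m.argmax(axis=0)
    clusters = {}
    for node, attractor in enumerate(attractors):
        clusters.setdefault(int(attractor), []).append(node)
    return sorted(clusters.values())


# Toy residue-level graph (assumed): two triangles joined by one edge (2-3).
toy_graph = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
seeds = markov_cluster(toy_graph)
print(seeds)
```

On this toy graph the two triangles come out as separate seeds. In the pipeline described above, such seeds would then be expanded by the overlapping subgraph discovery algorithm, which lets a residue belong to more than one candidate subgraph.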
Keywords/Search Tags: epitope prediction, protein flexibility, overlapping epitope discovery, local overlapping community detection, graph convolutional network