
Epitope Discovery Based On Subgraph Expansion

Posted on: 2020-05-21
Degree: Master
Type: Thesis
Country: China
Candidate: S G Wu
Full Text: PDF
GTID: 2370330578460820
Subject: Computer software and theory
Abstract/Summary:
Subgraph extension is a process for overlapping clustering of subgraphs, proposed to address the inability of traditional clustering algorithms to produce overlapping clusters. In this thesis, the non-overlapping clusters produced by a traditional graph clustering algorithm are re-extended using the structural features of the subgraphs and the vertex and edge attributes of the subgraphs in the network, so that the final clustering results have overlapping properties. Based on these ideas, two subgraph extension models, Glep and GKCE, are proposed. Glep first approximates the structural information of a subgraph by constructing 14-dimensional graph attributes for the non-overlapping clusters, and then optimizes the weights of these 14 features using term frequency-inverse document frequency (TF-IDF). Finally, using the optimized features, it completes the overlapping expansion of the non-overlapping clusters based on the stability of the 14-dimensional features. GKCE uses not only the structural features of subgraphs but also their vertex and edge attributes. It analyzes the similarity between subgraphs during expansion with graph kernels and completes the overlapping expansion of the non-overlapping clusters, yielding the final overlapping clustering results.

Antigen epitopes are the amino acid residues on an antigen that are contacted by antibodies. Epitope discovery means locating those amino acid residues on antigen chains that are contacted by antibodies. For a given antigen chain, the epitopes fall into three cases according to their number and positional relationship: a single epitope, multiple separate epitopes, and multiple overlapping epitopes. At present, most epitope prediction models focus on single-epitope prediction. Few models can predict multiple separate epitopes, and fewer still can predict multiple overlapping epitopes. Current studies have shown that antigen epitopes have four characteristics in the spatial structure of the antigen chain: specificity, surface, aggregation and overlap. These characteristics make structure-based prediction models an active research direction in the field of antigen epitope prediction.

Starting from the structural network of amino acid residues in an antigen chain, two overlapping subgraph clustering models are designed to analyze the epitopes in the chain from different angles. This addresses two major problems of most antigen epitope prediction models: they cover only a single prediction scenario, and their prediction performance is poor. The experimental results show that the average F1 score of GKCE, the best overlapping subgraph clustering model, is 67%, 81% and 37% higher than that of other similar models on single epitopes, multiple separate epitopes and multiple overlapping epitopes, respectively.
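
The general idea of turning non-overlapping clusters into overlapping ones can be illustrated with a minimal Python sketch. This is not Glep or GKCE: the 14-dimensional features, TF-IDF weighting and graph kernels from the abstract are replaced here by a much simpler, assumed rule (a boundary vertex joins a cluster when at least a given fraction of its neighbours already lie in that cluster). The function name `expand_clusters`, the `threshold` parameter and the toy graph are all hypothetical.

```python
from typing import Dict, List, Set


def expand_clusters(
    adj: Dict[str, Set[str]],
    clusters: List[Set[str]],
    threshold: float = 0.5,
) -> List[Set[str]]:
    """Expand each non-overlapping cluster by one layer of boundary vertices.

    Assumes `adj` is a symmetric adjacency map of an undirected graph.
    The membership rule is an illustrative stand-in, not the thesis's method.
    """
    expanded = []
    for core in clusters:
        # Candidates: vertices outside the cluster with a neighbour inside it.
        boundary = {v for u in core for v in adj[u]} - core
        grown = set(core)
        for v in boundary:
            # Fraction of v's neighbours that belong to the original cluster.
            inside = len(adj[v] & core)
            if inside / len(adj[v]) >= threshold:
                grown.add(v)
        expanded.append(grown)
    return expanded


# Toy graph: a triangle a-b-c, a triangle d-e-f, and d also linked to b and c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e", "f"},
    "e": {"d", "f"},
    "f": {"d", "e"},
}
clusters = [{"a", "b", "c"}, {"d", "e", "f"}]  # a non-overlapping partition

result = expand_clusters(adj, clusters, threshold=0.5)
print(result)
```

In this toy example vertex `d` has half of its neighbours in the first cluster, so it is added there while remaining in the second cluster, and the two expanded clusters overlap on `d`.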
Keywords/Search Tags: subgraph, antigen epitopes, clustering, extension, tf-idf, graph kernel