Research On Deep Learning-based Antigen-antibody Interaction Sites Prediction

Posted on: 2023-07-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Lu
Full Text: PDF
GTID: 1524306908462364
Subject: Software engineering
Abstract/Summary:
Antibodies are large Y-shaped proteins produced by the immune system that bind specifically to antigens invading the body. In recent years, antibody drugs have become one of the fastest-growing areas of biological drug production because of their high specificity and high affinity, and they occupy a significant share of the global medicine market. The specific binding of antigens and antibodies depends mainly on the interaction of key residues, called the antigen epitope and the antibody paratope and known collectively as antigen-antibody interaction sites. Identifying epitopes and paratopes by experimental methods is labor-intensive and time-consuming. Rapid prediction using computational methods is therefore of great significance for vaccine development and antibody drug design.

Proteins such as antigens and antibodies can be represented as sequence data and structure data. Sequence data describes the order of residues in a molecule, and structure data contains atom coordinates and other relevant information. However, existing sequence-based and structure-based prediction methods have limitations, such as incomplete analysis of the factors that affect residue binding and incomplete use of biological knowledge. This dissertation studies deep learning methods for predicting antigen-antibody interaction sites by analyzing the factors that affect the binding tendency of key residues and by utilizing the sequence and structure properties of antigens and antibodies. The main research contents and innovations are as follows:

(1) To address the problem that the existing sliding window strategy cannot effectively integrate the features of neighboring residues, a sequence feature representation method, SW-ATT, based on the sliding window strategy and an attention mechanism is proposed. First, a weight is dynamically assigned to each neighboring residue using the attention mechanism. Then, the "context vector" of the target residue is constructed as the weighted sum of the neighboring residues. Finally, the feature vector of the target residue and the "context vector" are combined to update the sequence feature representation. A residue property prediction method, SW-ATTCNN, is constructed by combining this method with a convolutional neural network. Experimental results show that the proposed method can effectively integrate the features of neighboring residues by dynamically assigning weights based on protein sequences.

(2) To address the problem that the sliding window size in existing sliding window strategies is chosen arbitrarily and limited to a single value, a prediction method, MW-AGABISP, based on multi-window features is proposed. First, multi-window features of the target residue are constructed using sliding windows of multiple sizes. Then, a convolutional neural network is used to extract deep multi-window features. Finally, a recurrent neural network is used to fuse the deep multi-window features for classification. Experimental results show that this method can exploit the neighboring residues that affect the binding tendency of the target residue by constructing multi-window features and utilizing the biological properties of antigen and antibody sequences.

(3) Exploiting the high conservation of antibody sequence and structure, a paratope prediction method, ParSeqSpa, which fuses sequential and spatial neighbors, is proposed. First, the target residue features are updated by aggregating sequential neighbor information through an isometric convolutional network unit with a residual connection. Then, based on the updated target residue features, spatial neighbor information is aggregated through a graph convolutional network unit to update the target residue features again for prediction. Experimental results show that this method can describe the highly conserved local environment of the target residue by aggregating the two kinds of neighbor information through a convolutional neural network connected to a graph convolutional network. It also improves classification ability on antibody structure data.

(4) To address the complex interactions among antigen residues, an epitope prediction method, EpiLocalGlobal, which combines local and global features, is proposed. First, a bidirectional long short-term memory network with an attention mechanism captures the complex interactions between the target residue and distant residues as global features. At the same time, spatial neighbor information is used to construct the local structural environment of the target residue, and local features are extracted by a graph convolutional neural network. Finally, the target residue is classified by combining the global and local features. Experimental results show that this method can accurately describe the characteristics of the target residue by extracting global features that capture the complex relationships among residues. It also improves the representational capacity of local features on antigen structure data.

In summary, this dissertation proposes a variety of sequence-based and structure-based methods for epitope prediction and paratope prediction. These methods can discover key residues involved in antigen-antibody interaction and can therefore guide future biological experiments for vaccine development and antibody drug design.
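
As an illustration of item (1), the idea of weighting the neighboring residues inside a sliding window and summing them into a "context vector" can be sketched as follows. This is only a minimal sketch and not the dissertation's implementation: the abstract does not state how attention scores are computed, so the dot-product scoring, the function names (`softmax`, `context_vector`, `sw_att_representation`), the `half_window` parameter, and the toy feature vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(features, target, half_window):
    """Attention-weighted sum of the neighbors of residue `target`.

    features:    list of per-residue feature vectors (lists of floats)
    half_window: number of neighbors considered on each side of the target
    Scoring neighbors by dot product with the target is an assumption;
    the source abstract does not specify the scoring function.
    """
    lo = max(0, target - half_window)
    hi = min(len(features), target + half_window + 1)
    neighbors = [j for j in range(lo, hi) if j != target]
    dim = len(features[target])
    if not neighbors:
        # A lone residue has no neighbors, so its context is all zeros.
        return [0.0] * dim
    scores = [
        sum(a * b for a, b in zip(features[target], features[j]))
        for j in neighbors
    ]
    weights = softmax(scores)  # one dynamically assigned weight per neighbor
    return [
        sum(w * features[j][d] for w, j in zip(weights, neighbors))
        for d in range(dim)
    ]


def sw_att_representation(features, half_window):
    """Concatenate each residue's own features with its context vector."""
    return [
        f + context_vector(features, i, half_window)
        for i, f in enumerate(features)
    ]


# Toy usage: four residues with two hypothetical features each.
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
updated = sw_att_representation(toy_features, half_window=1)
```

In this sketch each residue ends up with twice as many features as it started with, because its own vector is concatenated with the context vector. Other ways of combining the two (for example, addition) would be equally consistent with the abstract's wording.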
Keywords/Search Tags: Epitope, Paratope, Sequence, Structure, Attention Mechanism, Graph Convolutional Network