Characteristic Analysis And Prediction Of Protein-protein Interactions And Protein Interaction Sites

Posted on: 2010-08-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: R Liu
Full Text: PDF
GTID: 1100360302971120
Subject: Bio-IT
Abstract/Summary:
One of the main objectives of the post-genome era is to elucidate the mechanisms of protein-protein interactions. Protein-protein interactions underlie life activities and play an important role in cellular metabolism. Studying them helps to uncover the nature of life activities, to better understand disease mechanisms, and to gain important clues for rational drug design. The rapid development of bioinformatics provides many effective approaches for understanding these mechanisms. After reviewing the current state of research on protein-protein interactions, we selected several active topics to study in this dissertation.

Firstly, we took transient protein complexes (excluding antigen-antibody complexes) as our research subject and analyzed the features of transient interfaces. The analysis indicated that, besides the two well-known features, sequence profile and accessible surface area, the temperature factor (B-factor) also reflects differences between the interface and the rest of the protein surface. We then combined these three features to predict interaction sites in transient complexes using a support vector machine (SVM). Cross-validation and independent testing showed that B-factor plays a key role in identifying interaction sites, and that exploiting the complementarity of the three features improves prediction performance.

Secondly, building on this work, we applied the same method to identify interaction sites in antigen-antibody complexes, extending the original dataset accordingly. Predictions on the updated dataset showed that B-factor is an effective indicator of interface residues in antigen-antibody complexes as well as in other transient complexes. Additionally, we attempted to develop a post-processing method that reduces the number of false positives produced by the SVM predictors, which can further improve prediction performance.

Thirdly, a representative dataset used by other popular methods was used to statistically analyze the features of B-cell discontinuous epitopes. We found that, besides the widely used feature of solvent accessibility, B-factor also reflects the difference between epitope and non-epitope residues. We then combined these two features with a logistic regression model to recognize B-cell discontinuous epitopes. The results indicate that both features can be used to identify discontinuous epitopes and that their complementarity makes an important contribution to prediction performance.

Finally, we chose five widely used sequence features (amino acid composition, the square root of amino acid composition, pseudo amino acid composition, physicochemical properties of amino acids, and domain composition) and combined them with SVM to predict protein-protein interactions. The results show that physicochemical properties of amino acids and domain composition perform better than the remaining features, and that the machine learning algorithm used also plays a critical role in protein-protein interaction prediction.
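As an illustration of the two simplest sequence features named above, the following minimal Python sketch computes amino acid composition (the fraction of each of the 20 standard residues in a sequence) and its element-wise square root. The function names are hypothetical, and representing a protein pair by concatenating the two vectors is an assumption made for illustration only; the abstract does not state how pairs were encoded, and this is not the dissertation's implementation.

```python
import math

# The 20 standard amino acids, in a fixed order so every vector has the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue."""
    seq = sequence.upper()
    # Count only standard residues; other symbols (e.g. 'X') are ignored.
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


def sqrt_composition(sequence):
    """Return the element-wise square root of the amino acid composition."""
    return [math.sqrt(f) for f in amino_acid_composition(sequence)]


def pair_features(seq_a, seq_b, encoder=amino_acid_composition):
    """Encode a protein pair by concatenating both vectors.

    Assumption: concatenation is one common pairing scheme, chosen here
    only as an example; the source does not specify the scheme used.
    """
    return encoder(seq_a) + encoder(seq_b)
```

The resulting 40-element vector for a pair could then be passed to any SVM implementation as one training example, with the label indicating whether the two proteins interact.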
Keywords/Search Tags: protein-protein interactions, interface residues, B-cell discontinuous epitopes, temperature factor, support vector machine