
Protein Residues Depth, Flexibility And Function Prediction And Analysis

Posted on: 2010-10-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Zhang
Full Text: PDF
GTID: 1110360302457664
Subject: Bioinformatics
Abstract/Summary:
Understanding protein structure, function, and dynamics, and the relationships among them, is of great importance for addressing problems in biology. The gap between the amount of protein sequence data and the amount of structure data is growing exponentially as DNA sequencing techniques develop. The commonly accepted hypothesis that a protein's sequence uniquely determines its structure enables the development of methods that predict three-dimensional (3D) structure from sequence, and such methods are of substantial value in narrowing this gap. However, sequence-based 3D structure prediction remains a challenging task. Therefore, a set of intermediate, more tractable predictions that target individual structural aspects, such as residue depth, solvent-accessible surface area, secondary structure, and contact number or contact order, have been studied and applied to the prediction of protein structure and function.

Although the sequence determines the structure, protein structure intrinsically changes with its environment. Proteins undergo constant thermal fluctuations and other types of motion, which make their structures flexible and allow them to adapt to various functions. Several methods exist for studying protein flexibility, including experimental techniques such as X-ray crystallography and nuclear magnetic resonance, and computational methods such as molecular dynamics simulation, normal mode analysis, elastic network models, weighted contact models, and machine-learning-based prediction. Each of them has its own disadvantages. Developing new methods could provide alternative insights into the flexibility and dynamics of proteins and more comprehensive information for function analysis.

The work in this dissertation consists of three parts:

1. For the sequence-based prediction of a one-dimensional descriptor called residue depth, we propose a new prediction and optimization method, RDPred, which uses support vector regression (SVR).
It combines information extracted from the sequence, PSI-BLAST scoring matrices, and secondary structure predicted with PSIPRED. With detailed feature design and selection and careful parametrization of the SVR, our method outperforms the existing method. We also compare prediction performance across the different residue depth indices, which can be selected for and applied to different structure predictors.

2. Analyzing protein flexibility is important for understanding protein function. We investigate the relationship between flexibility, expressed as the B-factor, and relative solvent accessibility (RSA) in the context of the local neighborhood (local with respect to the sequence), together with related concepts such as residue depth. We observe that the flexibility of a given residue is strongly influenced by the solvent accessibility of its adjacent neighbors. The mean normalized B-factor of exposed residues with two buried neighbors is smaller than that of buried residues with two exposed neighbors. Including the RSA of the neighboring residues (local RSA) significantly increases the correlation with the B-factor. The correlation between local RSA and B-factor is shown to be stronger than the correlation obtained with local distance-based or volume-based residue depth. We also find that the correlation coefficients between B-factor and RSA for the 20 amino acids, which we call the flexibility-exposure correlation index, are strongly correlated with the stability scale that characterizes the average contribution of each amino acid to folding stability. Our results reveal that predicted RSA can be used to distinguish between disordered and ordered residues, and that including local predicted RSA values provides a better contrast between these two types of residues.
Prediction models based on local actual RSA and local predicted RSA show similar or better results for B-factor and disorder prediction when compared with several existing approaches. We validate our models using three case studies, which show that this work provides useful clues for deciphering the structure-flexibility-function relationship.

3. No low-dimensional descriptor, such as residue depth or solvent accessibility, can capture the complete information in a protein structure. New indices may therefore be very useful for studying particular aspects of protein structure and function. Based on a new measure called the gamma radius, proposed in [138], we provide an effective algorithm that computes the gamma radius of each residue in a protein using Delaunay tessellation and dynamic programming. We believe that this measure could be used to study the functional "pockets" of proteins and to describe the binding sites of protein-ligand interactions.
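
The local RSA analysis in part 2 can be illustrated with a minimal sketch. The abstract does not specify how the B-factor is normalized or how neighboring RSA values are combined, so the code below assumes a per-chain z-score for the B-factor and an unweighted mean over a sequence window of one neighbor on each side. The function names (`normalize_b_factors`, `local_rsa`, `pearson`) and the numbers are hypothetical and serve only to show the shape of the computation, not the dissertation's actual model or data.

```python
from math import sqrt


def normalize_b_factors(b_factors):
    """Z-score the B-factors of one chain (an assumed normalization)."""
    n = len(b_factors)
    mean = sum(b_factors) / n
    sd = sqrt(sum((b - mean) ** 2 for b in b_factors) / n)
    if sd == 0:
        raise ValueError("B-factors are constant; z-scores are undefined")
    return [(b - mean) / sd for b in b_factors]


def local_rsa(rsa, half_window=1):
    """Unweighted mean RSA of residue i and its sequence neighbors.

    The window is truncated at the chain termini. Equal weights are an
    assumption; the abstract does not state the weighting scheme.
    """
    out = []
    for i in range(len(rsa)):
        window = rsa[max(0, i - half_window): i + half_window + 1]
        out.append(sum(window) / len(window))
    return out


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Toy values invented for illustration; not taken from any structure.
    rsa = [0.05, 0.60, 0.10, 0.02, 0.45, 0.70, 0.65, 0.20]
    b_factors = [14.0, 19.0, 16.0, 12.0, 22.0, 30.0, 28.0, 18.0]

    z = normalize_b_factors(b_factors)
    print("RSA vs. normalized B-factor:      ", round(pearson(rsa, z), 3))
    print("local RSA vs. normalized B-factor:", round(pearson(local_rsa(rsa), z), 3))
```

On real data, the two printed coefficients would be compared over many chains; a wider window can be tried by passing a larger `half_window`.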
Keywords/Search Tags: protein structure prediction, residue depth, solvent accessibility, flexibility, B-factor, disordered region, gamma radius, support vector regression, linear model