
Studies On The Prediction Of Relative Accessible Surface Area In Alpha Helical Transmembrane Protein Structures

Posted on: 2016-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: F Xiao
Full Text: PDF
GTID: 2180330476453291
Subject: Control Science and Engineering
Abstract/Summary:
A transmembrane protein (TMP) is a type of membrane protein (MP) that spans the entire biological membrane to which it is permanently attached. There are two basic types of TMPs: alpha-helical (TMH) and beta-barrel (TMB). Membrane-embedded, polytopic alpha-helical proteins constitute the majority of ion channels, transporters, and receptors in living organisms. The human proteome is expected to contain about 3,000 TMH proteins in total, accounting for 25% of the entire human proteome space. This study focuses on TMH proteins because few TMB protein structures have been solved, which would cause the so-called "small-sample" problem in statistics.

We address two problems: relative accessible surface area (rASA) prediction and protein-protein binding site prediction in alpha-helical transmembrane proteins.

For rASA prediction, we present a novel sequence-based method, MemBrain-Rasa, that predicts the relative solvent accessible surface area from primary sequences. MemBrain-Rasa features a newly developed prediction engine based on segment structural similarity, which is combined with a machine learning engine. We locally constructed a comprehensive database of residue rASA values, which is searched for segments expected to be structurally similar to segments of the query sequence. The segment structural similarity-based prediction is then fused with the support vector regression outputs using a designed knowledge rule. Our results show that MemBrain-Rasa achieves a Pearson correlation coefficient (CC) of 0.733 and a mean absolute error (MAE) of 13.593, which are 26.4% and 26.1% better, respectively, than existing predictors. MemBrain-Rasa represents new progress in the structure modeling of alpha-helical transmembrane proteins.
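The two evaluation measures reported above, CC and MAE, have standard definitions. The following is a minimal sketch of how they could be computed for per-residue rASA values. The function names and the toy numbers are illustrative assumptions, not part of MemBrain-Rasa, and the 0 to 100 scale used in the example is also an assumption.

```python
import math


def pearson_cc(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sums of centered cross-products and centered squares.
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_o * var_p)


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Toy per-residue rASA values (hypothetical, assumed to lie on a 0-100 scale).
observed_rasa = [10.0, 35.0, 60.0, 5.0, 80.0]
predicted_rasa = [20.0, 30.0, 50.0, 15.0, 70.0]

print("CC :", pearson_cc(observed_rasa, predicted_rasa))
print("MAE:", mean_absolute_error(observed_rasa, predicted_rasa))
```

A higher CC and a lower MAE both indicate better agreement between predicted and observed rASA.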
MemBrain-Rasa is available at: www.csbio.sjtu.edu.cn/bioinf/MemBrain/.

For binding residue prediction, we present a novel sequence-based method that predicts protein-protein binding residues from primary protein sequences using machine learning classifiers. We first use a support vector regression model to predict relative solvent accessibility from sequence-based features, including the position-specific scoring matrix (PSSM), conservation score, z-coordinate prediction, secondary structure prediction, physical parameters, and sequence length. We then combine these features with the predicted solvent accessibility and use ensemble support vector machines to predict protein-protein binding residues. To the best of our knowledge, no previous method predicts protein-protein binding residues in alpha-helical membrane proteins. Our method outperforms MAdaBoost, which has been used successfully to predict protein-ligand binding residues, and random forest, which has been used to predict protein-protein binding residues from surface residues. We also assess the importance of each individual feature type. The PSSM profile and conservation score are shown to be more effective than the other feature types for predicting protein-protein binding residues in alpha-helical membrane proteins.
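The two-stage design described above (append the predicted solvent accessibility to the sequence features, then let an ensemble decide) can be sketched as follows. The abstract does not state how the ensemble members are combined, so majority voting is an assumption here, and the threshold functions are hypothetical stand-ins for trained support vector machines. All names and numbers are illustrative.

```python
from typing import Callable, List, Sequence

# A classifier maps a feature vector to 1 (binding) or 0 (non-binding).
Classifier = Callable[[Sequence[float]], int]


def append_predicted_rasa(features: Sequence[float], predicted_rasa: float) -> List[float]:
    """Stage 1 output becomes one extra feature for stage 2."""
    return list(features) + [predicted_rasa]


def ensemble_vote(classifiers: Sequence[Classifier], features: Sequence[float]) -> int:
    """Strict majority vote over the ensemble members (assumed combination rule)."""
    votes = sum(clf(features) for clf in classifiers)
    return 1 if 2 * votes > len(classifiers) else 0


# Hypothetical stand-ins for trained SVMs: simple thresholds on single features.
members: List[Classifier] = [
    lambda x: int(x[-1] > 20.0),  # uses the appended predicted rASA
    lambda x: int(x[0] > 0.5),    # uses the first sequence feature
    lambda x: int(x[1] > 0.5),    # uses the second sequence feature
]

sequence_features = [0.8, 0.3]  # toy values for one residue
augmented = append_predicted_rasa(sequence_features, 42.0)
print("binding" if ensemble_vote(members, augmented) else "non-binding")
```

In practice each member would be a trained classifier, and the combination rule would be whatever the thesis actually uses.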
Keywords/Search Tags:Alpha-helical membrane proteins, Relative solvent accessibility surface area, Machine learning, Structure similarity, Binding residue, Predictor