
Computational prediction of protein function for cell cycle kinases and histone methyltransferases from conserved biophysical properties

Posted on: 2010-12-05
Degree: Ph.D.
Type: Dissertation
University: Columbia University
Candidate: Wrzeszczynski, Kazimierz O
Full Text: PDF
GTID: 1444390002481786
Subject: Chemistry
Abstract/Summary:
A rapid accumulation of protein sequences and structures through genomic and structure consortiums has presented researchers with a large number of proteins with little or no functional annotation. Furthermore, assigning specificity within a known and established functional class of proteins, and attributing each protein to a particular biological pathway or process, remains an experimental challenge. The complete annotation of any one protein in the proteome is therefore a multi-step process spanning decades of laboratory (experimental) science. Our main goal as computational biologists is to understand how well we can alleviate some of this work through computational experiments. Machine learning techniques can classify functionally related proteins where homology-transfer as well as sequence and structure motifs fail. An understanding of the capabilities within computational biology that can discriminate enzymatic function and specificity on a biochemical and cellular level is essential to this goal.

We first present a method aimed at complementing homology-transfer in the identification of cell cycle control kinases from sequence alone. To do so, we identified functionally significant residues in cell cycle proteins through their high sequence conservation and biophysical properties. We then incorporated these residues and their features into support vector machines (SVMs) to identify new kinases and, more specifically, to differentiate cell cycle kinases from other kinases and other proteins. By using these highly conserved, semi-buried residues and their biophysical properties, we could distinguish cell cycle serine/threonine (S/T) kinases from other kinase families at levels of accuracy and coverage that outperform homology-transfer predictions. An application to the entire human proteome predicted several human proteins with limited previous annotations to be candidates for cell cycle kinases.

We then wanted to better understand how far conserved functional residue features can aid further enzymatic specificity predictions. We applied our method to the computational prediction of another type of transferase, the histone methyltransferases. The histone methyltransferases presented a unique classification problem, since many of the proteins contain a similar structurally conserved domain. We identify biophysical diversity among the methyltransferase family of proteins and use this diversity in our SVM feature-based predictions. We show that conserved biophysical residue features also outperform full-sequence features for prediction accuracy in this class of transferases. Furthermore, SVM feature-based identifications of histone methyltransferases provide higher accuracy and coverage than homology-transfer annotations.
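
The feature-extraction idea described above (pick highly conserved alignment positions, then describe them by biophysical properties) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the abstract does not say how conservation was scored or which properties were used, so the Shannon-entropy cutoff, the Kyte-Doolittle hydropathy scale, the toy alignment, and the function names `column_entropy`, `conserved_columns`, and `encode` are all assumptions made for illustration. Solvent accessibility (the "semi-buried" criterion) and the SVM training step are left out.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy scale, used here as one example of a
# biophysical residue property (an assumption; the abstract names none).
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gaps."""
    residues = [r for r in column if r != "-"]
    if not residues:
        # An all-gap column carries no conservation signal; treat it as
        # maximally variable so it is never selected.
        return math.log2(20)
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conserved_columns(alignment, max_entropy):
    """Indices of columns whose entropy does not exceed max_entropy."""
    return [
        i for i, column in enumerate(zip(*alignment))
        if column_entropy(column) <= max_entropy
    ]


def encode(aligned_sequence, columns):
    """Hydropathy values at the selected columns (0.0 for gaps/unknowns)."""
    return [KD_HYDROPATHY.get(aligned_sequence[i], 0.0) for i in columns]


# Hypothetical toy alignment; real input would be a curated family alignment.
alignment = [
    "GKGSFGV",
    "GEGTYGV",
    "GQGAFGV",
    "GRGSYGK",
]

cols = conserved_columns(alignment, max_entropy=1.2)
features = [encode(seq, cols) for seq in alignment]
```

Each row of `features` is a fixed-length numeric vector, which is the form an SVM library expects as input. Restricting the vector to conserved positions reflects the abstract's claim that such residues carry more functional signal than full-sequence features.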
Keywords/Search Tags:Histone methyltransferases, Cell cycle, Protein, Biophysical, Computational, Conserved, SVM, Prediction