
The Design And Calculation Of Features For Protein-nucleic Acid Interaction And The Prediction Of RNA-binding Protein Residues And DNA-binding Proteins

Posted on: 2017-04-02; Degree: Master; Type: Thesis
Country: China; Candidate: M J Sun
GTID: 2310330512470212; Subject: Pharmacy
Abstract/Summary:
Protein-nucleic acid interactions are involved in many important biological processes. Protein-RNA interactions play a vital role in transcription, the post-transcriptional processing of pre-mRNA, the stability and localization of mRNA, and translation. Protein-DNA interactions participate in DNA replication and repair, DNA packaging, recombination, and chromatin and ribosome formation. With the rapid development of modern biological techniques, more and more information about protein sequences and structures is now available. This allows researchers to use bioinformatics approaches to analyze protein-nucleic acid interactions comprehensively, uncover the potential mechanisms behind them, and develop effective models for predicting them. With the help of such computational prediction methods, experimentalists can design and perform experiments more efficiently.
In this study, we first surveyed the existing prediction methods for RNA-binding residues in proteins and summarized their advantages and disadvantages. Then, after a statistical analysis of a large number of protein-RNA complexes, we designed and calculated two novel descriptors: residue-level electrostatic surface potential and interface triplet propensity. We combined these two new features with other established structure- and sequence-based features to develop a random forest classifier. In five-fold cross-validation on RBP195, the area under the receiver operating characteristic curve (AUC) of our method is 0.900; when the classifier is applied to the test dataset RBP69, the prediction accuracy (ACC) is 0.868. This performance indicates that our method can be helpful for subsequent studies on the prediction of protein-RNA interactions.
We also used the two newly designed features to predict DNA-binding proteins (DBPs). The resulting DBP prediction model achieves an AUC of 0.985 in five-fold cross-validation on Dataset456 and an ACC of 0.862 on the test dataset Dataset276.
Keywords/Search Tags: Computer-aided drug design, Bioinformatics, protein-RNA interaction, protein-DNA interaction
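
The two metrics reported in the abstract, AUC and ACC, can be illustrated with a minimal sketch. This is not the thesis's evaluation code: the function names `roc_auc` and `accuracy`, the 0.5 decision threshold, and the toy labels and scores are all assumptions chosen for illustration. AUC is computed here as the fraction of positive-negative pairs in which the positive example receives the higher score, with ties counted as half.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of examples classified correctly at the given threshold.

    The 0.5 default is an assumption, not a value taken from the thesis.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(int(p == y) for p, y in zip(preds, labels))
    return correct / len(labels)


# Hypothetical toy data: 1 = binding, 0 = non-binding.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.1]

print(f"AUC = {roc_auc(labels, scores):.3f}")
print(f"ACC = {accuracy(labels, scores):.3f}")
```

In a five-fold cross-validation such as the one described above, the data would be split into five parts, and scores for each part would come from a model trained on the other four before a metric like this is computed.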