
Intelligence Algorithms For Protein Structure Prediction And Nucleic Acids Binding Site Annotation

Posted on: 2022-03-30  Degree: Doctor  Type: Dissertation
Country: China  Candidate: H Su  Full Text: PDF
GTID: 1480306518998479  Subject: Bioinformatics
Abstract/Summary:
The structure of a protein determines its function. Understanding a protein's structure is of great significance for the design of protein-targeting small-molecule drugs and for functional annotation. However, solving protein structures by wet-lab experiments is notoriously difficult and costly, and the accuracy of existing structure prediction methods still needs improvement. Nucleic acids perform their functions through interactions with other molecules, so computer-aided drug design can be facilitated by detecting the nucleic acid sites that bind other molecules. However, the accuracy of existing methods for predicting these binding sites is relatively low. Consequently, this dissertation conducts research on the following two aspects:

1. Protein structure prediction. In this work, we present trRosetta2, an improved version of trRosetta for protein structure prediction. The improvement over trRosetta is twofold. The first part is the application of a new multi-scale network, Res2Net, for improved prediction of inter-residue geometries, including distances and orientations. The second is an automated integration of multiple templates into the network to further increase accuracy, especially for easy/medium targets. Benchmark tests show that trRosetta2 improves over trRosetta by around 7%. A preliminary version of trRosetta2 was tested blindly in the recent CASP14 experiment, where it generated structure models with an average TM-score of 0.67 on 91 targets; this improves to 0.69 with trRosetta2, very close to the top server group (0.71 by Zhang-Server). A further benchmark test on 161 targets from the CAMEO experiments shows that trRosetta2 achieves an average TM-score of around 0.8, outperforming the top groups.

2. Nucleic acids binding site annotation. In this work, we developed two novel methods to predict nucleic acids binding sites.

(1) The first method, denoted NucBind, predicts nucleic acids-binding residues in proteins. NucBind combines the predictions from a support vector machine-based ab initio method, SVMnuc, and a template-based method, COACH-D. SVMnuc was trained with features from three complementary sequence profiles. COACH-D predicts the binding residues based on homologous templates identified from a nucleic acids-binding library. The proposed methods were assessed and compared with other peer methods on three benchmark datasets. Experimental results show that NucBind consistently outperforms other state-of-the-art methods. Despite their higher accuracy, SVMnuc and NucBind, like many other ab initio methods, still show cross-prediction between DNA-binding and RNA-binding residues. We attribute the success of NucBind to two factors. The first is the use of improved features extracted from three complementary sequence profiles in SVMnuc. The second is the combination of two complementary methods: the ab initio method SVMnuc and the template-based method COACH-D.

(2) The second method, denoted RNAsite, predicts small molecule-RNA binding sites using sequence profile-based and structure-based descriptors. It can identify small molecule-binding sites when either the RNA structure or only the sequence is available. RNAsite was shown to be competitive with the state-of-the-art methods on the experimental structures of two independent test sets. When predicted structure models were used, RNAsite outperformed other methods by a large margin. The possibility of improving RNAsite by geometry-based binding pocket detection was investigated. The influence of RNA structural flexibility and of the conformational changes caused by ligand binding on RNAsite was also discussed. RNAsite is anticipated to be a useful tool for the design of RNA-targeting small-molecule drugs.
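
For readers unfamiliar with the TM-score values quoted above, the following is a minimal Python sketch of the standard TM-score formula evaluated for one fixed superposition. It is not taken from the dissertation: the function name `tm_score`, its parameters, and the 0.5 Å floor on `d0` are illustrative assumptions. The full TM-score additionally maximizes this quantity over superpositions, which the sketch does not attempt.

```python
def tm_score(distances, target_length):
    """TM-score of one fixed superposition (hypothetical helper, not the thesis code).

    distances: distances in angstroms between aligned residue pairs
               (model residue vs. the matching native residue).
    target_length: number of residues in the target (native) structure.
    """
    if target_length <= 0:
        raise ValueError("target_length must be positive")

    # Length-dependent distance scale from the standard TM-score definition:
    # d0 = 1.24 * (L - 15)^(1/3) - 1.8
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    # Assumption: floor d0 at 0.5 so very short targets stay well defined.
    d0 = max(d0, 0.5)

    # Each aligned pair contributes a value in (0, 1]; unaligned target
    # residues contribute nothing but still count in the normalization.
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length
```

Under this definition a perfect model scores 1.0, and a pair whose distance equals `d0` contributes exactly 0.5, so an average of 0.67 to 0.8 indicates that most residues lie within a few angstroms of their native positions after superposition.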
Keywords/Search Tags: protein structure prediction, nucleic acids binding sites, protein-nucleic acids binding residues, small molecule-RNA binding sites, deep learning, machine learning, template-based prediction