
Accurate Sequence-based Prediction On Disordered Flexible Linkers

Posted on: 2021-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: Q Xing
GTID: 2480306548482664
Subject: Operational Research and Cybernetics
Abstract/Summary:
Disordered flexible linkers (DFLs) are an important and abundant class of intrinsically disordered regions that serve as connectors between protein domains and between structural elements within domains, and that facilitate disorder-based allosteric regulation. While computational estimates suggest that thousands of proteins have DFLs, they have been annotated experimentally in fewer than 200 proteins. Moreover, experimental annotation of DFLs is difficult and demands substantial manpower, materials and funding. This substantial annotation gap can be reduced with the assistance of accurate computational predictors. The only existing predictor of DFLs, DFLpred, was designed to trade accuracy for shorter runtime: it limits the use of relevant but computationally costly predictive inputs and relies on local, window-based information without considering protein-level characteristics. We conceptualize, design and test APOD (Accurate Predictor Of DFLs), the first highly accurate predictor that utilizes both local and protein-level inputs that quantify propensity for disorder, sequence composition, sequence conservation and selected putative structural properties. We demonstrate that every input type considered, including the protein-level information, contributes to our predictive model. Consequently, APOD offers significantly more accurate predictions than its faster predecessor, DFLpred. This can be explained by the use of a more reliable and comprehensive set of inputs and a more sophisticated predictive model, a well-parametrized Support Vector Machine. APOD achieves AUC = 0.82 (a 28% improvement over DFLpred) and MCC = 0.42 (a 180% increase over DFLpred) when tested on an independent, low-similarity test dataset. These high levels of predictive performance make it a suitable choice for accurate, small-scale prediction of DFLs. This work also provides an online web server for the APOD method: https://yanglab.nankai.edu.cn/APOD. The user submits a protein sequence in FASTA format together with an email address through the corresponding forms; once the APOD prediction is complete, the result is sent to the user by email.
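The two metrics quoted above, AUC and MCC, can be illustrated with a minimal Python sketch. It uses the standard definitions (Matthews correlation coefficient from the confusion matrix, and the area under the ROC curve in its pairwise-ranking form). The per-residue labels, scores and the 0.5 threshold are made-up toy values chosen for illustration; they are not APOD or DFLpred outputs, and the function names are hypothetical rather than taken from the thesis.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary (0/1) labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve: the fraction of (positive, negative)
    pairs in which the positive is scored higher; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy per-residue data (assumed values): 1 = residue in a DFL, 0 = not.
labels = [0, 0, 1, 1, 0, 1, 0, 0]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.50, 0.05]
# Binarize the scores at an assumed threshold of 0.5 to compute MCC.
preds = [1 if s >= 0.5 else 0 for s in scores]

print(f"MCC = {mcc(labels, preds):.3f}")
print(f"AUC = {auc(labels, scores):.3f}")
```

AUC is computed from the raw scores and so does not depend on a threshold, whereas MCC is computed from binarized predictions and changes with the threshold chosen.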
Keywords/Search Tags: Disordered Flexible Linker, Sequence Conservation, Structural Property, Disorder Content, Support Vector Machine (SVM)